NUMBER 16 2009
The National Health and Nutrition Examination Surveys (NHANES) Volatile Organic Compound Dataset: An Introduction to the Project and Analyses of the Relationship between Personal Exposures to VOCs and Behavioral, Socioeconomic, and Demographic Characteristics
A Collaborative Project of The Mickey Leland National Urban Air Toxics Research Center and The National Center for Health Statistics
ABOUT THE NUATRC
The Mickey Leland National Urban Air Toxics Research Center (NUATRC or the Leland Center) was established in 1991 to develop and support research into potential human health effects of exposure to air toxics in urban communities. Authorized under the Clean Air Act Amendments (CAAA) of 1990, the Center released its first Request for Applications in 1993. The aim of the Leland Center since its inception has been to build a research program structured to investigate and assess the risks to public health that may be attributed to air toxics. Projects sponsored by the Leland Center are designed to provide sound scientific data useful for researchers and for those charged with formulating environmental regulations. The Leland Center is a public-private partnership, in that it receives support from government sources and from the private sector. Thus, government funding is leveraged by funds contributed by organizations and businesses, enhancing the effectiveness of the funding from both of these stakeholder groups. The U.S. Environmental Protection Agency (EPA) has provided the major portion of the Center’s government funding to date, and a number of corporate sponsors, primarily in the chemical and petrochemical fields, have also supported the program. A nine-member Board of Directors oversees the management and activities of the Leland Center. The Board also appoints the thirteen members of a Scientific Advisory Panel (SAP) who are drawn from the fields of government, academia and industry. These members represent such scientific disciplines as epidemiology, biostatistics, toxicology and medicine. The SAP provides guidance in the formulation of the Center’s research program and conducts peer review of research results of the Center’s completed projects.
The Leland Center is named for the late United States Congressman George Thomas “Mickey” Leland from Texas who sponsored and supported legislation to reduce the problems of pollution, hunger, and poor housing that unduly affect residents of low-income urban communities.
This project has been funded wholly or in part by the United States Environmental Protection Agency under assistance agreement X83234601. The contents of this document do not necessarily reflect the views and policies of the Environmental Protection Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use.
TABLE OF CONTENTS

BACKGROUND AND PURPOSE
    THE MICKEY LELAND NATIONAL URBAN AIR TOXICS RESEARCH CENTER (NUATRC)
    THE NATIONAL HEALTH AND NUTRITION EXAMINATION SURVEYS (NHANES)
    THE NUATRC-NCHS COLLABORATION: THE VOC PROJECT
    PURPOSE OF THIS REPORT
THE VOC PROJECT
    OBJECTIVE
    VOC MEASUREMENT
    REVIEW OF LABORATORY ANALYSES
    QUALITY CONTROL AND QUALITY ASSURANCE PROCEDURES
    BLOOD LEVEL VOCS
    PUBLIC RELEASE OF THE VOC DATASET
ANALYSIS OF THE NHANES VOC DATASET
CONCLUSION
REFERENCES
ACKNOWLEDGMENTS
ABBREVIATIONS
JOURNAL MANUSCRIPT REPRINTS
    DISTRIBUTIONS OF PERSONAL VOC EXPOSURES: A POPULATION-BASED ANALYSIS (JIA, ET AL)
    PREDICTORS OF PERSONAL AIR CONCENTRATIONS OF CHLOROFORM AMONG U.S. ADULTS IN NHANES 1999-2000 (RIEDERER, ET AL)
    DEMOGRAPHIC, RESIDENTIAL, AND BEHAVIORAL DETERMINANTS OF ELEVATED EXPOSURES TO BENZENE, TOLUENE, ETHYLBENZENE, AND XYLENES AMONG U.S. POPULATION: RESULTS FROM 1999-2000 NHANES (SYMANSKI, ET AL)
    CHARACTERIZING RELATIONSHIPS BETWEEN PERSONAL EXPOSURES TO VOCS AND SOCIOECONOMIC, DEMOGRAPHIC, BEHAVIORAL VARIABLES (WANG, ET AL)
NUATRC RESEARCH REPORT NO. 16
The Mickey Leland National Urban Air Toxics Research Center and The National Center for Health Statistics
BACKGROUND AND PURPOSE

THE MICKEY LELAND NATIONAL URBAN AIR TOXICS RESEARCH CENTER (NUATRC)
The Clean Air Act Amendments of 1990 established a control program for sources of 187 “hazardous air pollutants,” or “air toxics” that may pose a risk to public health. With the passage of these amendments, Congress established the NUATRC to develop and direct an environmental health research program that would promote a better understanding of the risks posed to human health by the presence of these toxic chemicals in urban air. Established as a public/private research organization, the NUATRC's research program is developed with guidance from a Scientific Advisory Panel composed of scientific experts from academia, industry, and government and seeks to fill gaps in scientific data. NUATRC-funded research is intended to assist policy makers in the evaluation and promulgation of sound environmental health decisions. The NUATRC accomplishes its research mission by sponsoring research on human health effects of air toxics at universities and research institutions, by supporting periodic workshops to share the current science on air toxics, and by publishing NUATRC-funded study results in its “NUATRC Research Reports,” thereby contributing meaningful and relevant data to the peer-reviewed literature.

THE NATIONAL HEALTH AND NUTRITION EXAMINATION SURVEYS (NHANES)
The National Health Survey Act, passed in 1956, authorized a continuing survey of the Nation's health to provide current statistical data on the effects of illness and disability in the US. To comply with the Act, the National Center for Health Statistics (NCHS) conducted three National Health Examination Surveys in the 1960s. In 1970, a nutrition component was added to the survey, and, between 1971 and 1994, NCHS conducted four National Health and Nutrition Examination Surveys (NHANES). These surveys were designed to capture specific consecutive time periods, usually of six years' duration, and data were released for three or six-year periods. In these surveys, data on individuals were typically collected by at least three approaches: through direct interview, physical examination, and by clinical testing and measurement. With the inception of the 1999 NHANES, the survey became a continuous annual event. It now collects data from a representative sample of the US population each year. About 5,000 randomly selected subjects per year are chosen, aged from birth onward, from 15 different locations across the nation. Participants provide demographic and health data and undergo physical examinations to assess their current health status. For this purpose, fully equipped Mobile Examination Centers (MECs) are transported to data collection sites, referred to as “stands,” so that medical personnel can conduct the exams on-site in a standardized manner.

THE NUATRC-NCHS COLLABORATION: THE VOC PROJECT
The NUATRC submitted a proposal in 1997 to the NCHS for a collaborative project that would measure personal exposures to volatile organic compounds (VOCs) among a representative subgroup of participants in NHANES 1999-2001. The collaborative project was designed to provide a profile of VOC exposures experienced by US adults during their daily activities. The NHANES-VOC project was a data-gathering effort; the data are available on the NCHS website, as described below. To encourage wide use of the dataset for new research projects and scientific publications, the NUATRC released a Request For Applications (RFA) in 2006 entitled: “Relationship between Personal Exposures to VOCs and Behavioral, Socioeconomic, and Demographic Characteristics: Analysis of the NHANES VOC Project Dataset.” Manuscripts written by the project grantees, based on their research under this program, are reproduced in this report.

PURPOSE OF THIS REPORT
This report is intended to inform the research community about the NUATRC- and NCHS-funded VOC database so that it can be accessed for future data mining activities. It also features the analyses of four investigators funded by NUATRC to analyze the dataset; their work highlights the utility of the dataset in understanding the national distribution of personal exposures to VOCs and determinants of these exposures. Their work can be used by other investigators to generate hypotheses about potentially significant exposure sources and pathways for VOCs in the general US population.
The Relationship between Personal Exposures to VOCs and Behavioral, Socioeconomic, and Demographic Characteristics
THE VOC PROJECT

OBJECTIVE
The NUATRC proposed a project that would collect personal exposure data on specific VOCs in a representative subset of NHANES participants. Such data would provide information on the distribution of personal exposures to these hazardous air pollutants in the US population. If such an effort were continued, it would provide valuable information on trends over time of these exposures and also help evaluate the impact of regulations to control these hazardous air pollutants. The NUATRC proposal was accepted by NCHS, and the Collaborative NCHS-NUATRC VOC Project (VOC Project) became a three-year component of the NHANES survey during the period 1999-2001. The aim of the project was to collect personal exposure data about specific VOCs in a representative subset of NHANES participants between the ages of 20 and 59 years. The target sample size for the VOC Project was 1,000 participants over the three-year period. Personal exposure data were obtained for periods of 48 to 72 hours, using small lightweight passive sampling badges that subjects wore from the time they left the MECs until they returned to the MEC 48 to 72 hours later. Eligible participants were recruited after completion of their physical examinations. Activity data for the exposure periods were collected from participants by means of a questionnaire administered at the end of the exposure periods when the participants returned to the MEC. The participants also provided information about household characteristics at that time.

VOC MEASUREMENT
The VOCs measured in the personal exposure study included: benzene, chloroform, ethylbenzene, tetrachloroethene, toluene, trichloroethene, o-xylene, m,p-xylene, 1,4-dichlorobenzene, and methyl tert-butyl ether (MTBE). The VOC passive exposure monitor (or badge) used in the study was the 3M Organic Vapor Monitor (Model 3520, 3M Company, St. Paul, MN).
All VOC analyses were performed in accordance with methods described in the 3M publication “Organic Vapor Monitor Sampling and Analysis Guide - October 1998” (http://multimedia.3m.com/mws/mediawebserver?66666UuZjcFSLXTtlX&6OXMtEVuQEcuZgVs6EVs6E666666--). Extraction efficiencies were determined in accordance with the 3M procedures. Method detection limits were determined for each compound based on the standard laboratory methods. A Gas Chromatograph/Mass Spectrometer was used for analyses. Laboratory procedures and equipment standards followed accepted USEPA protocols.

REVIEW OF LABORATORY ANALYSES
During the three-year project period, two different laboratory contractors performed the badge analyses in two different time periods. Exposure data for the first year and a half of the project were analyzed by Clayton Laboratories and, for the remainder of the project, by the Environmental and Occupational Health Sciences Institute (EOHSI) laboratory of the University of Medicine and Dentistry, New Jersey (UMDNJ), both contractors to the Leland Center. Prior to approving the release of the VOC Project data set, NCHS scientists conducted a review of the procedures followed by the two laboratory groups in order to assess the compatibility of the approaches taken by the two laboratories and the reasonableness of the data produced for the project. Although the methods used by the two contract laboratories differed from those used at NCHS, the results were judged to be comparable after the review was completed.

QUALITY CONTROL AND QUALITY ASSURANCE PROCEDURES
Laboratory procedures and equipment standards followed accepted USEPA protocols. For Quality Assurance purposes, 10 percent of samples were split and analyzed independently by the NUATRC contractor laboratory and an outside laboratory. The analyses of these paired samples were conducted at the two laboratories concurrently. The results were evaluated for consistency and accuracy. Quality Control procedures during the VOC Project included the collection and analysis of the following samples from each of the stands: two field blanks, one positive control, two duplicate pairs, and one office air sample.
BLOOD LEVEL VOCS
A subset of VOC Project participants also took part in a related NHANES component, sponsored by the Centers for Disease Control's (CDC) Center for Environmental Health (CEH). That component collected data on blood-level VOCs and home drinking water VOCs. Those study subjects were asked to bring samples of home drinking water to the MEC when they returned at the end of their exposure periods. The goal of the CEH Project was to characterize the distributions of blood and water VOCs and to investigate possible relationships between them.

PUBLIC RELEASE OF THE VOC DATASET
After the three-year data collection period for the VOC Project ended, a Workshop was held to review the project data. Participants included a panel of six researchers with significant experience in conducting and evaluating community studies of environmental health effects (Edo Pellizzari of Research Triangle Institute, Paul Feder of Battelle, David Ashley of CDC, Thomas Stock of the University of Texas School of Public Health, Martin Harper of CDC, and Edward Avol of the University of Southern California Keck School of Medicine), NCHS scientists and staff, and NUATRC staff. At the conclusion of the Workshop, the Panel recommended that the 1999-2000 VOC Project dataset be released on the NCHS web site as part of the 1999-2000 NHANES data release. Data for ten VOCs were released in April 2005: benzene, chloroform, ethylbenzene, tetrachloroethylene, trichloroethylene, toluene, m,p-xylene, o-xylene, 1,4-dichlorobenzene, and MTBE. The website for the 1999-2000 NHANES dataset is: http://www.cdc.gov/nchs/nhanes/nhanes99_00.htm. The 2001 VOC Project dataset could not be publicly released because of its small size and the risk of disclosure of individual information or identities in a one-year dataset. The three-year 1999-2001 VOC Project dataset was released for use in the Research Data Center in 2007. The Research Data Center at NCHS was established to assist researchers whose projects require access to data that are confidential in nature, or might lead to the disclosure of confidential information or individual identities. These researchers are asked to submit proposals to the Research Data Center, describing their projects.
If their proposals are approved, the staff will then prepare a dataset created for the particular project, while maintaining strict confidentiality, and can provide statistical programming and consulting expertise to facilitate the data analysis for the project. There are fees associated with using the Research Data Center. The Research Data Center is located at the NCHS headquarters office in Hyattsville, Maryland. Researchers may work onsite at the headquarters or may access their data at a remote site. Another option is to carry out the research at a Census Research Data Center. The web site address for this Center is: http://www.cdc.gov/nchs/r&d/rdc.htm
ANALYSIS OF THE NHANES VOC DATASET
To encourage wide use of the dataset for new research projects and scientific publications, the NUATRC released an RFA in 2006 entitled: “Relationship between Personal Exposures to VOCs and Behavioral, Socioeconomic, and Demographic Characteristics: Analysis of the NHANES VOC Project Dataset.” In November 2006, the NUATRC awarded four one-year contracts. A condition of the award was that each investigator was to prepare a manuscript based on the project and submit it to a peer-reviewed publication. Grants were awarded to the following investigators:
• Stuart Batterman, Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan
• P. Barry Ryan, Department of Environmental and Occupational Health, Rollins School of Public Health, Emory University, Atlanta, Georgia
• Elaine Symanski, Division of Epidemiology and Disease Control, University of Texas School of Public Health, Houston, Texas
• Sheng-Wei Wang, Institute of Environmental Health, Taiwan (formerly of Environmental and Occupational Health Sciences Institute, Piscataway, New Jersey)

In conformance with award requirements, each of these investigators published their findings in the peer-reviewed literature, and these publications (through agreement with the respective journals) are reprinted in the pages that follow.

Briefly, Drs. Jia, D'Souza, and Batterman (2008) characterized distributions of personal exposures to ten of the VOCs measured in the 1999-2000 NHANES. This study provides graphs and tables that illustrate the national exposure distribution and compares the NHANES results to studies assessing VOC exposures among different populations. According to the Jia et al. analyses, participants' exposures to VOCs vary dramatically. They identified four groups of possible emission sources: gasoline vapors and exhaust; tap water disinfection products; cleaning products; and gasoline additive (MTBE).
They identified several methodological issues, and suggested that complete models for the distribution of VOC exposures require an approach that combines standard and extreme value distributions and carefully identifies outliers. Drs. Riederer, Bartell, and Ryan (2009) found that 8 of 10 US adults were exposed to detectable levels of chloroform.
Significant predictors of personal exposure to chloroform included demographic characteristics (age, race/ethnicity), housing characteristics (type of home, chloroform concentration in home tap water), and personal exposure microevents (leaving home windows open, visiting a pool). Reported showering activity was not a significant predictor of personal air chloroform in the study. The authors argued that NHANES measurements likely underestimated true inhalation exposures since subjects did not wear sampling badges while showering or swimming, and because of possible undersampling by the passive monitors.

Drs. Symanski, Stock, Tee, and Chan (2009) investigated the relationship of socioeconomic, behavioral, demographic, and residential characteristics to personal exposures to benzene, toluene, ethylbenzene, and xylenes (BTEX) compounds among a subsample of the NHANES participants. Geometric mean (GM) levels were significantly higher for males for all compounds except toluene. For benzene, GM levels were elevated among smokers and Hispanics. Regression analyses suggested that the presence of an attached garage (for BTEX), having windows closed in the home during the monitoring period (for benzene and toluene), pumping gasoline (for toluene, ethylbenzene and xylenes), or using paint thinner, brush cleaner, or stripper (for xylenes) resulted in higher exposures in the general population. The results of these analyses confirmed findings of previous studies.

Drs. Wang, Majeed, Chu, and Lin (2009) found that different subsets of behavioral, socioeconomic, and demographic variables were significant exposure predictors, depending upon the nature of the VOCs. Sociodemographic factors (e.g., race/ethnicity and family income) were generally found to influence personal exposures to three chlorinated compounds: chloroform, 1,4-dichlorobenzene, and tetrachloroethene.
For the BTEX compounds, housing characteristics (e.g., leaving windows open and having an attached garage), and personal activities related to the use of fuels or solvent-related products had a significant influence on exposures. Differences in BTEX exposures were also found in relation to gender due to differences in time spent at work/school and outdoors. The investigators presented a variety of statistical analysis techniques for resolving challenges and limitations of the dataset, including dealing with issues of outliers, collinearity, and interaction effects.
CONCLUSION
A number of VOCs are among the air toxics listed in the 1990 Clean Air Act Amendments. Many of these compounds were known to be present in both indoor and outdoor air, but had not been monitored among the general population. Information on levels of exposure to these compounds was essential to determine the need for regulatory mechanisms to reduce the levels of hazardous air pollutants to which the general public is exposed. The NUATRC therefore embarked on a project with the NCHS to develop a profile of VOC exposures encountered by US adults in their daily activities. The NUATRC-NCHS collaborative project provides valuable data, revealing a national distribution of personal exposures to VOCs, which can be used to compare how exposures in individual communities relate to the national distribution. Because the NHANES characterized national-level VOC exposures using a population-based sampling strategy, the results represent non-occupational VOC exposures throughout the US. The results of the four NUATRC grant recipients can be used by other investigators in generating hypotheses about potentially significant exposure sources and pathways for VOCs in the general US population. The results may also help in developing approaches for minimizing VOC exposures and reducing environmental health risks in the general population. Other investigators are encouraged to access the dataset for future data mining activities.
REFERENCES
Jia C, J D'Souza, and S Batterman. 2008. Distributions of Personal VOC Exposures: A Population-based Analysis. Environ Int 34(7):922-931.
Riederer AM, SM Bartell, and PB Ryan. 2009. Predictors of Personal Air Concentrations of Chloroform among US Adults in NHANES 1999-2000. J Expo Sci Environ Epidemiol 19(3):248-259.
Symanski E, TH Stock, PG Tee, and W Chan. 2009. Demographic, Residential, and Behavioral Determinants of Elevated Exposures to Benzene, Toluene, Ethylbenzene, and Xylenes among the US Population: Results from 1999-2000 NHANES. J Toxicol Environ Health A 72(14):915-24.
Wang SW, MA Majeed, PL Chu, and HC Lin. 2009. Characterizing Relationships between Personal Exposures to VOCs and Socioeconomic, Demographic, Behavioral Variables. Atmos Environ 43:2296-2302.
ACKNOWLEDGMENTS
The NUATRC wishes to express its sincere appreciation to the recipients of its NHANES VOC Project grants, Dr. Stuart Batterman at University of Michigan, Drs. Barry Ryan and Anne Riederer at Emory University, Dr. Elaine Symanski at the University of Texas, and Dr. Sheng-Wei Wang at Institute of Environmental Health in Taiwan as well as their research teams. The NUATRC also thanks Drs. Thomas Stock and Maria Morandi, who developed the original study design and questionnaire for the Pilot Study and Dr. Clifford Weisel of EOHSI, who supervised the analysis of badge samples. We also thank Brenda Gehan, NUATRC Project Coordinator; Clifford Johnson, Director of NHANES; Susan Schober, Senior Epidemiologist, NCHS; David Lacher, Medical Officer, NCHS; Lester Curtin, Senior Mathematical Statistician, NCHS; and NUATRC Scientific Advisory Panel, whose expertise, diligence, and patience have facilitated the successful completion of this report.

ABBREVIATIONS
BTEX         benzene, ethylbenzene, toluene, and xylene
CAAA         Clean Air Act Amendments
CDC          Centers for Disease Control
CEH          Center for Environmental Health
EOHSI        Environmental and Occupational Health Sciences Institute
EPA          Environmental Protection Agency
GM           geometric mean
MTBE         methyl tert-butyl ether
MEC          mobile examination center
NCHS         National Center for Health Statistics
NHANES       National Health and Nutrition Examination Surveys
NUATRC       National Urban Air Toxics Research Center
RFA          Request for Applications
SAP          Scientific Advisory Panel
UMDNJ        University of Medicine and Dentistry, New Jersey
VOC          volatile organic compound
VOC Project  Collaborative NCHS-NUATRC VOC Project
Environment International 34 (2008) 922–931
Environment International (journal homepage: www.elsevier.com/locate/envint)
Distributions of personal VOC exposures: A population-based analysis
Chunrong Jia, Jennifer D'Souza, Stuart Batterman *
University of Michigan, Ann Arbor, MI 48109-2029, USA
ARTICLE INFO
Article history: Received 19 November 2007; Accepted 10 February 2008; Available online 1 April 2008
Keywords: Benzene; Distribution; Exposure; Gumbel; Log-normal; MTBE; Outliers; Personal; Risk; Volatile organic compound; VOCs

ABSTRACT
Information regarding the distribution of volatile organic compound (VOC) concentrations and exposures is scarce, and there have been few, if any, studies using population-based samples from which representative estimates can be derived. This study characterizes distributions of personal exposures to ten different VOCs in the U.S. measured in the 1999–2000 National Health and Nutrition Examination Survey (NHANES). Personal VOC exposures were collected for 669 individuals over 2–3 days, and measurements were weighted to derive national-level statistics. Four common exposure sources were identified using factor analyses: gasoline vapor and vehicle exhaust, methyl tert-butyl ether (MTBE) as a gasoline additive, tap water disinfection products, and household cleaning products. Benzene, toluene, ethyl benzene, xylenes, chloroform, and tetrachloroethene were fit to log-normal distributions with reasonably good agreement to observations. 1,4-Dichlorobenzene and trichloroethene were fit to Pareto distributions, and MTBE to a Weibull distribution, but agreement was poor. However, distributions that attempt to match all of the VOC exposure data can lead to incorrect conclusions regarding the level and frequency of the higher exposures. Maximum Gumbel distributions gave generally good fits to extrema; however, they could not fully represent the highest exposures of the NHANES measurements. The analysis suggests that complete models for the distribution of VOC exposures require an approach that combines standard and extreme value distributions, and that carefully identifies outliers. This is the first study to provide national-level and representative statistics regarding VOC exposures, and its results have important implications for risk assessment and probabilistic analyses. © 2008 Elsevier Ltd. All rights reserved.
* Corresponding author. Tel.: +1 734 763 2417. E-mail address: [email protected] (S. Batterman).
0160-4120/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi: 10.1016/j.envint.2008.02.002
Jia et al: Reprinted from Environment International, 34(7), Jia C, J D'Souza, and S Batterman, “Distributions of Personal VOC Exposures: A Population-based Analysis,” 922-931, 2008, with permission from Elsevier.

1. Introduction
Information regarding the distribution of pollutant concentrations is used to answer many important questions in exposure and risk assessment, such as ‘What is the variability of the exposure estimates?’ (US EPA, 1992), and ‘How many individuals have exposure over a given risk-based threshold?’ In the context of risk management, this information is needed to apportion emission sources and, more generally, to evaluate policies and interventions: ‘What fraction of exposure is due to occupational exposure, traffic, indoor and other sources?’ and: ‘Would controlling emission sources in residential garages significantly reduce benzene exposure?’ (Batterman et al., 2007; Loh et al., 2007).

The availability, and then the form and parameterization of distributions are critical assumptions that determine the answers to such questions. While the use of “standard” distributions has been encouraged when feasible (Finley and Paustenbach, 1994), typical statistical measures of central tendency and dispersion, such as means, medians and standard deviations, and the common assumption of log-normality may inadequately describe the true distribution. Probabilistic methods, which use probability distributions instead of point estimates to represent the range of possible exposures, are potentially more representative of actual exposures (Sielken and Valdez-Flores, 1999) and have been considered the most promising technique to emerge in the field of exposure assessment (Nieuwenhuijsen et al., 2006). These remarks apply to air pollutant measurements obtained using ambient, indoor and personal monitoring, and to many other types of environmental measurements. Further, they apply to both longitudinal (sequences) and cross-sectional (spatial) data, with some differences resulting from the types of correlations involved.

Pollutant distributions used in exposure and risk analyses are usually derived from empirical data, and measurements using personal monitoring are considered to be the best approximations to actual exposure (NRC, 1991). While personal monitoring has been used for many pollutants, e.g., particulate matter, nitrogen oxides and volatile organic compounds (VOCs), previous studies have not used a population-based sample, and thus are not necessarily representative of a broad population. In addition, the databases underlying many studies used to estimate distributions may be unavailable, inconsistent in quality, and difficult to understand. Indeed, it is a mammoth task to design, recruit, monitor, quality-assure and evaluate a population-based program, especially for large regions like the U.S. Importantly, if the assumed pollutant distribution is not representative, then predictions may not reflect true exposures, and conclusions regarding exposures and risks may be erroneous.

The objective of this study is to characterize the distributions of personal exposures to VOCs in the U.S. measured in the 1999–2000
National Health and Nutrition Examination Survey (NHANES). This population-based survey represents what is believed to be the largest study of VOC exposures in a community setting. The behavior of the full range of the measurements is described using common statistical distributions. We use correlations and factor analyses to identify related VOCs and possible sources, and compare measurements to risk-based levels. We then fit extreme concentrations to the maximum Gumbel distribution, and address the issue of outliers. We conclude by contrasting the NHANES measurements with several other recent studies of personal VOC exposures.

2. Methods

2.1. NHANES
NHANES was designed primarily to assess the health and nutritional status of adults and children in the U.S. through interviews and physical examinations. Surveys were conducted periodically from 1971 to 1994, and became continuous in 1999. The current NHANES (also known as continuous NHANES) was initiated in 1999 and uses a 2-year survey cycle. In the overall NHANES 1999–2000 sample, there were 9965 participants (5161 adults and 4804 children ≤ 18 years of age). Participants were sampled through a stratified, multistage probability sampling scheme (CDC, 2006a,b). Initially, counties (or blocks of counties) were selected. Within counties, groups of blocks (household clusters) were chosen. Letters were sent to selected households within those blocks, informing them of the study, after which NHANES staff visited the households and one or more participants were interviewed from the household. Five sub-populations were over-sampled to ensure sufficient sample size, specifically, low-income persons, adolescents 12–19 years, persons ≥60 years of age, African Americans, and Mexican Americans. The 1999–2000 survey was the first to measure personal exposure to VOCs. A sub-sample of 851 adults (ages 20–59 years) of the overall NHANES sample was selected to participate in these measurements.
The sub-sample is based on a one-fourth sample from 1999 and a one-third sample from 2000, and was designed to be nationally representative. 2.2. VOC sampling and analysis Personal VOC exposures were collected on the adult sub-sample selected from the NHANES sample. There were no additional exclusion criteria. Participants were instructed to wear badge-type passive exposure monitors (3M 3520 OVM, 3M Co., St. Paul, MN) for 48–72 h. Additionally, participants were administered a short questionnaire regarding the length of time they wore their badge and 30 other questions on factors potentially related to VOC exposures, e.g., contact with dry cleaning, tobacco smoke and gasoline vapor over the past several days. These questions were not included in the larger NHANES survey. VOC badges were chemically desorbed and analyzed by gas chromatography/mass spectrometry (GC/MS, HP 5890/5972 MSD, EnviroQuant ChemStation, Hewlett-Packard, Palo Alto, CA) following well-defined protocols and QA/QC protocols (CDC, 2006c; Weisel et al., 2005a; Chung et al., 1999a,b). VOCs included benzene, toluene, ethyl benzene, m,p-xylene, o-xylene (i.e., BTEX compounds), chloroform, trichloroethene (TCE), tetrachloroethene (PERC), 1,4-dichlorobenzene (p-DCB) and methyl tert-butyl ether (MTBE) (CDC, 2006c). Properties and method detection limits (MDLs) of these
compounds are summarized in Table 1, and the MDLs determined by Weisel et al. (2005a) were applied in this paper.

2.3. Data acquisition and cleaning

Data were extracted from the 1999–2000 NHANES databases, maintained at the Centers for Disease Control and Prevention's (CDC) website (www.cdc.gov/nchs/about/major/nhanes/lab99_00.htm). The original dataset contained 851 cases (individuals) and 53 variables, which included the participant's identification number, concentrations and detection status of the ten VOCs, sampling information (including number of hours the badge was worn), house characteristics, and participant activities. The dataset also contains sampling variables specific to the VOC dataset, which represent the influence of each observation in extrapolating to the national level, and which account for the clustering in the data. These variables allow the results to be generalized to the U.S. civilian non-institutionalized population. Due to the clustering, the total variance also includes intra-cluster correlation, since observations within a cluster tend to be similar. Not accounting for the clustering gives incorrect variance estimates and inflated significance. Of the 851 cases, 182 were non-respondents and were excluded from further analyses. Two cases with excessively long sampling periods (5.7 and 7.9 days, participants #578 and #468, respectively) were excluded. An initial screening analysis identified two outliers (participants #3852 and #4076) with extremely high concentrations of BTEX (>2000 μg m−3 of ethyl benzene and xylenes for #3852, and >6000 μg m−3 of toluene for #4076). These two cases were also excluded. The final dataset included 665 participants.

2.4. Data analysis

As simple indicators of exposure, we defined two new variables: BTEX as the sum of the five BTEX components; and TVOC10 as the sum of the ten VOCs measured in NHANES. The sums also used one-half of the MDLs for non-detects.
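The construction of the two composite variables can be illustrated with a minimal sketch. It assumes a per-participant dictionary in which a non-detect is encoded as `None`; the dictionary keys, the function names, and the example concentrations are hypothetical and are not taken from the NHANES files. The MDLs in the example are the EOHSI values from Table 1, used here only as placeholders (in NHANES each measurement carries its own MDL).

```python
from typing import Dict, Optional, Tuple

# Hypothetical keys for the five BTEX components (m- and p-xylene are reported together).
BTEX = ("benzene", "toluene", "ethyl_benzene", "mp_xylene", "o_xylene")


def fill_nondetects(conc: Dict[str, Optional[float]],
                    mdl: Dict[str, float]) -> Dict[str, float]:
    """Replace each non-detect (encoded here as None) with one-half of its MDL."""
    return {voc: 0.5 * mdl[voc] if value is None else value
            for voc, value in conc.items()}


def composite_exposures(conc: Dict[str, Optional[float]],
                        mdl: Dict[str, float]) -> Tuple[float, float]:
    """Return (BTEX, TVOC10) for one participant, in the units of the inputs."""
    filled = fill_nondetects(conc, mdl)
    btex = sum(filled[voc] for voc in BTEX)
    tvoc10 = sum(filled.values())  # sum over all ten VOCs
    return btex, tvoc10


# Illustrative values only (ug/m3); TCE and MTBE are treated as non-detects.
mdl = {"benzene": 1.1, "toluene": 6.7, "ethyl_benzene": 0.74, "mp_xylene": 1.4,
       "o_xylene": 0.85, "p_dcb": 0.91, "chloroform": 0.42, "tce": 0.44,
       "perc": 0.42, "mtbe": 0.68}
sample = {"benzene": 2.8, "toluene": 17.4, "ethyl_benzene": 2.6, "mp_xylene": 6.5,
          "o_xylene": 2.4, "p_dcb": 1.7, "chloroform": 1.1, "tce": None,
          "perc": 0.7, "mtbe": None}
btex, tvoc10 = composite_exposures(sample, mdl)
```

Because the substitution is applied before summing, a participant with several non-detects still receives a small positive TVOC10 rather than a missing value.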
Analysis started with basic descriptive statistics, including sample size, detection frequency (DF), average, standard deviation and percentiles. Spearman rank correlation coefficients were calculated to investigate the relationship among pairs of VOCs using the weighted dataset. The statistical significance of the correlations was determined for each VOC pair as the minimum p-value from two linear regressions of each VOC on the other, also using the weights as well as appropriate variance estimates. This procedure was used for |r| > 0.4, and coefficients were considered significant for p ≤ 0.05. These statistics were generated by SAS-callable SUDAAN (release 9.0, Research Triangle Institute, Research Triangle Park, NC, U.S.) and the survey procedures in SAS 9.1 (SAS Institute Inc., Cary, NC, U.S.), which contain algorithms that properly weight cases and account for the non-random and clustered sampling of the NHANES data. Factor analysis was used to help identify common VOC sources and to identify a subset of four VOCs with varying properties and different sources for further analysis in the present paper (Supplemental materials give results for all ten VOCs). This analysis used log-transformed unweighted data, as full concentrations of most compounds were roughly log-normally distributed (see results), and varimax rotations. Our analysis focused on the larger factor loadings, typically >0.6. These analyses used SAS 9.1. To fit distributions of the full range of concentrations and extreme values, we synthesized a derived dataset (n = 14,898) in which cases were repeated with the frequency of repetitions based on the case weights. This approach yields valid statistics when the variance and correlation among variables are unimportant, e.g., in univariate analyses. Distributions were fitted by maximum likelihood estimation (Thompson, 1999a) using a sample size > 10,000 to achieve a high level of reliability in distributional
Table 1 Physical and chemical properties and method detection limits (MDLs) of the 10 VOCs

| VOC | Abbreviation | Chemical formula | CAS no. | MW | MP (°C) | BP (°C) | MDL (a), EOHSI (c) (μg m−3) | MDL (a), UTSPH (d) (μg m−3) | MDL (a), UTSPH (d) (μg m−3) | Unit Risk (b) (per μg m−3) | RfC (b) (μg m−3) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Benzene | Benzene | C6H6 | 71-43-2 | 78.1 | 5.5 | 80.1 | 1.1 | 0.54 | 0.7 | 7.8 × 10−6 | 30 |
| Toluene | Toluene | C7H8 | 108-88-3 | 92.1 | −95.0 | 110.6 | 6.7 | 7.12 | 5.5 | NA | NA |
| Ethyl benzene | Ethyl benzene | C8H10 | 100-41-4 | 106.2 | −95.0 | 136.2 | 0.74 | 0.22 | NA | NA | 1000 |
| m-Xylene (e) | m-Xylene | C8H10 | 108-38-3 | 106.2 | −47.9 | 139.1 | 1.4 | 0.65 | NA | NA | NA |
| p-Xylene (e) | p-Xylene | C8H10 | 106-42-3 | 106.2 | 13.3 | 138.4 | 1.4 | 0.65 | NA | NA | NA |
| o-Xylene | o-Xylene | C8H10 | 95-47-6 | 106.2 | −25.2 | 144.4 | 0.85 | 0.29 | NA | NA | NA |
| 1,4-Dichlorobenzene | p-DCB | C6H4Cl2 | 106-46-7 | 147.0 | 53.0 | 174.1 | 0.91 | 0.43 | 2.2 | NA | 800 |
| Chloroform | Chloroform | CHCl3 | 67-66-3 | 119.4 | −63.5 | 61.2 | 0.42 | 0.28 | 0.3 | 2.3 × 10−5 | NA |
| Trichloroethene | TCE | C2HCl3 | 79-01-6 | 131.4 | −84.8 | 87.0 | 0.44 | 0.24 | NA | NA | NA |
| Tetrachloroethene | PERC | C2Cl4 | 127-18-4 | 165.8 | −22.4 | 121.3 | 0.42 | 0.22 | 1.1 | NA | NA |
| Methyl tert-butyl ether | MTBE | C5H12O | 1634-04-4 | 88.2 | −108.6 | 55.2 | 0.68 | 0.38 | NA | NA | 3000 |

CAS=Chemical Abstracts Service, MW=molecular weight, MP=melting point, and BP=boiling point are all from the CRC handbook (Lide, 2005). RfC=Reference concentration, Unit Risk=carcinogenic slope factor.
(a) Based on 48-hour samples.
(b) From US EPA (2007) showing the high unit risk estimate for benzene (low estimate is 2.2 × 10−6).
(c) From Weisel et al. (2005a).
(d) From Chung et al. (1999b).
(e) m- and p-xylenes cannot be separated in the method, and they are considered as one compound.
Table 2 Descriptive statistics of weighted data including the ten VOCs plus BTEX and TVOC10 (concentrations in μg m−3)

| VOC | Missing | DF (%) | Mean | SD | GM | GSD | Skewness | Min | 25th | Median | 75th | 90th | 95th | 99th | Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Benzene | 21 | 65.5 | 5.3 | 7.0 | 3.2 | 2.6 | 4.3 | 0.7 | 1.4 | 2.8 | 5.8 | 13.5 | 18.7 | 32.6 | 119.5 |
| Toluene | 30 | 93.6 | 36.4 | 107.3 | 17.5 | 2.8 | 10.3 | 1.7 | 9.2 | 17.4 | 29.9 | 59.8 | 98.3 | 331.1 | 1610.8 |
| Ethyl benzene | 26 | 93.0 | 8.4 | 41.3 | 2.9 | 3.2 | 17.7 | 0.1 | 1.3 | 2.6 | 5.2 | 14.2 | 25.2 | 110.9 | 837.1 |
| m,p-Xylene | 22 | 95.9 | 18.8 | 43.2 | 7.2 | 3.5 | 5.8 | 0.2 | 3.3 | 6.5 | 14.6 | 38.7 | 69.8 | 233.0 | 728.7 |
| o-Xylene | 22 | 92.5 | 6.5 | 14.5 | 2.8 | 3.2 | 6.6 | 0.1 | 1.3 | 2.4 | 4.9 | 14.1 | 26.4 | 62.5 | 202.3 |
| BTEX | 15 | 97.6 | 74.4 | 153.0 | 36.5 | 2.9 | 6.6 | 0.8 | 18.6 | 33.2 | 66.6 | 152.8 | 285.3 | 784.4 | 1966.2 |
| p-DCB | 24 | 62.9 | 27.3 | 120.7 | 3.2 | 5.7 | 11.5 | 0.3 | 0.9 | 1.7 | 9.2 | 34.8 | 142.1 | 490.8 | 2235.6 |
| Chloroform | 17 | 79.3 | 2.7 | 4.5 | 1.4 | 3.0 | 4.4 | 0.2 | 0.6 | 1.1 | 3.0 | 5.9 | 12.1 | 25.4 | 53.9 |
| TCE | 24 | 22.9 | 3.4 | 22.7 | 0.4 | 3.4 | 11.0 | 0.1 | 0.2 | 0.3 | 0.5 | 1.2 | 7.4 | 75.5 | 327.3 |
| PERC | 26 | 69.0 | 5.2 | 31.2 | 1.0 | 4.1 | 16.1 | 0.1 | 0.4 | 0.7 | 2.4 | 6.6 | 18.5 | 76.8 | 659.1 |
| MTBE | 24 | 27.8 | 5.2 | 15.6 | 1.4 | 4.1 | 7.9 | 0.4 | 0.5 | 0.6 | 5.5 | 10.7 | 21.3 | 50.0 | 181.7 |
| TVOC10 | 13 | 99.2 | 117.3 | 200.9 | 61.9 | 2.9 | 5.3 | 0.6 | 31.1 | 55.6 | 106.0 | 273.4 | 382.8 | 1206.4 | 2276.1 |

n = 665. Percentiles below method detection limits (MDLs) are italicized. DF=Detection frequency; SD=standard deviation; GM=geometric mean; GSD=geometric standard deviation. Nondetects were set to one-half of the MDLs. Notes: As noted in the text, two cases were censored as outliers: Subject #4076 had a toluene concentration of 6280 μg m−3 (TVOC10 = 6488 μg m−3); Subject #3852 had ethyl benzene, m,p-xylene and o-xylene concentrations of 2210, 8370 and 2321 μg m−3, respectively (TVOC10 = 14,287 μg m−3).
attribution (Haas, 1997). Goodness-of-fit was evaluated using Anderson–Darling (A–D), Kolmogorov–Smirnov (K–S), and Chi-square (χ²) tests, and by visually examining probability plots and histograms. The A–D test served as the primary criterion since it is suitable for fitting distributions with extreme tails, and thus appropriate for the extrema emphasized here. Smaller A–D statistics indicate better fits. The other tests help to confirm or improve the selection. These analyses primarily used Crystal Ball (Decisioneering, Inc., Denver, CO, U.S.). To test whether the highest concentrations fit a maximum Gumbel distribution, a form used in several earlier air pollution analyses (Roberts, 1979a,b), we used a relatively simple procedure (Barnett, 1975) in which each ordered extreme value $C_i$ is plotted against the quantity $-\ln[-\ln(P_v)]$, where $P_v$ is:

$$P_v = \frac{r - 0.44}{N + 0.12} \qquad (1)$$

and where $r$ = the reverse rank of $C_i$, and $N$ = the number of the extreme values. A good fit (e.g., R² near unity) to the linear regression line confirms the appropriateness of this distribution. This analysis was performed for the top decile among all participants (n = 64–65 cases after eliminating missing data), and also for the top 5% of concentrations that exceeded MDLs (n = 11–30 cases, depending on the VOC).
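The probability-plot procedure of Eq. (1) can be sketched as follows. This is an illustration only, not the code used in the study: the function names are hypothetical, and the sketch assumes that r runs from 1 for the smallest extreme value to N for the largest, so that P_v increases with concentration. The regression is of concentration on the reduced variate, which gives a slope and intercept in concentration units.

```python
import math
from typing import List, Sequence, Tuple


def gumbel_plot_fit(extremes: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of C_i = intercept + slope * y_i, where
    y_i = -ln(-ln(P_v)) and P_v = (r - 0.44) / (N + 0.12).

    Assumption: r = 1 for the smallest extreme value and r = N for the largest.
    Returns (slope, intercept, r_squared).
    """
    n = len(extremes)
    if n < 3:
        raise ValueError("need at least three extreme values")
    conc = sorted(extremes)
    y = [-math.log(-math.log((r - 0.44) / (n + 0.12))) for r in range(1, n + 1)]
    y_mean = sum(y) / n
    c_mean = sum(conc) / n
    s_yy = sum((v - y_mean) ** 2 for v in y)
    s_cc = sum((c - c_mean) ** 2 for c in conc)
    s_yc = sum((v - y_mean) * (c - c_mean) for v, c in zip(y, conc))
    slope = s_yc / s_yy
    intercept = c_mean - slope * y_mean
    r_squared = (s_yc ** 2) / (s_yy * s_cc) if s_cc > 0 else 0.0
    return slope, intercept, r_squared


def top_fraction(values: Sequence[float], fraction: float) -> List[float]:
    """Return the largest `fraction` of the values (e.g. 0.10 for the top decile)."""
    k = max(1, round(len(values) * fraction))
    return sorted(values)[-k:]
```

A typical call would be `gumbel_plot_fit(top_fraction(concentrations, 0.10))`, where `concentrations` is a list of measurements for one VOC. A single very large value pulls the regression line away from the remaining points and lowers R², which is how outliers show up in this kind of plot.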
3. Results

3.1. Descriptive analysis

Descriptive statistics for the NHANES 1999–2000 VOC data are given in Table 2 (Supplementary materials give the complementary unweighted analysis in Table S1). Most of the VOCs had detection frequencies (DF) exceeding 60%, except for TCE (DF = 23%) and MTBE (DF = 28%). Concentrations varied widely, reflected in large standard deviations and skewness coefficients. Chloroform's range was more restricted (<MDL to 54 µg m−3). In most cases, statistics obtained using weighted and unweighted approaches were similar (Tables 2 and S2), although p-DCB and MTBE show several differences at the higher concentrations, e.g., the weighted 75th and higher percentile concentrations were much lower than the unweighted data for p-DCB, showing the importance of using population-based statistics. Of the ten VOCs, four had reference concentrations related to non-cancer toxicity and two had cancer-slope factors listed in the US EPA IRIS database (US EPA, 2007) (toxicity information for other VOCs is available elsewhere, but we restricted analyses to the IRIS list, which is peer-reviewed and widely accepted). To identify those individuals with high exposures to certain VOCs, we calculated the fraction with exposures that exceeded the reference concentration or excess lifetime cancer risk levels of 10−4, 10−5 and 10−6, with the strong assumption that the short-term NHANES measurement was representative of long-term exposures. Nearly all (>99%) of the measurements fell below the reference concentrations. A few (<1%) of the benzene and ethyl benzene measurements exceeded reference concentrations. However, 77 and 10% of the NHANES measurements exceeded benzene concentrations that correspond to lifetime individual risks of 10−5 and 10−4 (1.3 and 12.8 µg m−3, respectively) (the upper bound cancer-slope factor in IRIS was used for benzene). For chloroform, 86 and 16% exceeded these risk levels (0.4 and 4.4 μg m−3, respectively). However, because benzene's MDL (typically 1.1 μg m−3) corresponds to a risk level of 8.6 × 10−6, and chloroform's MDL (0.4 μg m−3) corresponds to 9.7 × 10−6, the statistics for the 10−5 risk level (and lower) may not be meaningful. Still, statistics for the higher exposures are significant and striking: there are few other environmental pollutants that yield ≥10−4 risks in 10–16% of the population. The median risks for benzene and chloroform (2.2 × 10−5 and 2.3 × 10−5, respectively) also are very similar to predictions based on microenvironmental concentrations and time activity patterns (Loh et al., 2007), although the fraction of the NHANES subjects with risks ≥10−4 for these compounds appears to exceed the upper range of predictions. This suggests that a full range distribution provides a poor fit to extrema, which deserves special attention since these extrema represent the most exposed individuals. In the following, we discuss the major VOC groups and individual compounds.

3.1.1. BTEX compounds

Unsurprisingly, the five BTEX compounds were detected in nearly every sample (DF = 66% for benzene to DF = 96% for m,p-xylene). Toluene and m,p-xylene had the highest concentrations among the ten VOCs (medians of 17.4 and 6.5 μg m−3, respectively), and toluene was the predominant VOC component among the ten VOCs for most (55%) participants. BTEX comprised the majority of TVOC10 (average percentage of BTEX:TVOC10 = 67 ± 25%). BTEX compounds often arise as a group, primarily from evaporated gasoline and vehicle exhaust. However, toluene and xylene also have many separate and indoor sources, e.g., paints, solvents, and cigarette smoke. Many studies have detected and reported high concentrations of the BTEX compounds (Raw et al., 2004; Saarela et al., 2003; Mohamed et al., 2002; Clayton et al., 1999).
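The risk screening described in Section 3.1 amounts to dividing a target lifetime risk by the unit risk to obtain a risk-based concentration, and then computing the (weighted) fraction of measurements above it. The sketch below uses the unit risks listed in Table 1; the function names and the optional weights argument are hypothetical, and the sketch does not reproduce the survey-design variance estimation.

```python
from typing import Optional, Sequence

# Unit risks per ug/m3 as listed in Table 1 (upper-bound value for benzene).
UNIT_RISK = {"benzene": 7.8e-6, "chloroform": 2.3e-5}


def risk_based_concentration(unit_risk: float, target_risk: float) -> float:
    """Concentration (ug/m3) corresponding to a given lifetime risk level."""
    return target_risk / unit_risk


def lifetime_risk(concentration: float, unit_risk: float) -> float:
    """Lifetime risk if the measured concentration represented long-term exposure."""
    return concentration * unit_risk


def weighted_fraction_above(concs: Sequence[float], threshold: float,
                            weights: Optional[Sequence[float]] = None) -> float:
    """Fraction of the (optionally weighted) sample strictly above the threshold."""
    if weights is None:
        weights = [1.0] * len(concs)
    total = sum(weights)
    above = sum(w for c, w in zip(concs, weights) if c > threshold)
    return above / total


# Benzene thresholds for the 1e-5 and 1e-4 risk levels.
benzene_1e5 = risk_based_concentration(UNIT_RISK["benzene"], 1e-5)
benzene_1e4 = risk_based_concentration(UNIT_RISK["benzene"], 1e-4)
```

The same arithmetic shows why the 10−5 statistics are fragile: `lifetime_risk(1.1, UNIT_RISK["benzene"])` is already close to 10−5, so non-detects imputed at one-half of the MDL sit just below the threshold.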
3.1.2. Chlorinated compounds

The four chlorinated compounds in the NHANES dataset had lower detection frequencies (23–79%) than the BTEX compounds. Typically, outdoor levels of these compounds are low, and exposure occurs mostly from indoor or especially occupational sources. Due to the differences among these compounds, each is discussed separately.

• p-DCB levels were surprisingly high and showed tremendous variability (median = 1.7 μg m−3, average = 27 μg m−3, maximum = 2236 µg m−3), possibly due to the use of mothballs, air fresheners and other deodorants (Sack et al., 1992; Wallace et al., 1987). p-DCB was the predominant VOC in 15% of the exposure measurements, and for these 133 participants, the median concentration was high, 61.7 µg m−3.

• Chloroform was found in most samples (79%) at a median concentration of 1.1 μg m−3. Chloroform, along with bromodichloromethane, dibromochloromethane and bromoform, is one of the trihalomethanes (THMs) that are often formed as water disinfection byproducts when chlorine is added to water, and that can be released to indoor air when chlorinated tap water is used (Weisel et al., 1999).

• Tetrachloroethylene (PERC) was found in 69% of samples with a median concentration of 0.7 μg m−3. PERC is a component of dry-cleaning fluids, and high concentrations might result from wearing freshly dry-cleaned clothes or visiting a dry cleaner. Two measurements were extremely high (659 and 490 µg m−3 for participants #9751 and #130), more than five times higher than the next measurement. It is puzzling, however, that these two participants did not report dry-cleaning exposure, breathing fumes from or using dry-cleaning fluid or spot remover. Subject #9751 spent an unusually large amount of time at work/school (mean = 9.4 h day−1). Subject #130 worked with paint thinners, brush cleaners, or strippers as well as glues, adhesives, hobbies or crafts, and also reported having new carpet installed in the past 6 months. Possibly the high exposure might be explained by the “exposure to solvents” that this individual reported.

• Trichloroethylene (TCE) was detected in relatively few cases (DF = 23%). However, the top ten highest concentrations exceeded 300 µg m−3. TCE has many industrial applications, e.g., it has been commonly used as a degreasing solvent, but residential uses are limited. Some exposure can occur from vapor intrusion into buildings from contaminated sub-soils and from other environmental sources, but the high concentrations suggest more immediate contact with solvents.

3.1.3. MTBE

This gasoline additive was detected in 28% of the measurements, with six very high concentrations (98–182 µg m−3). It was the predominant VOC in 5% (n = 45) of the subjects, where the median level was 23 µg m−3. MTBE has been used in gasoline in selected areas in the U.S. since 1979, though it is now being phased out. No other uses are likely to lead to public exposure. Thus, MTBE should not be detected in areas where MTBE is not in gasoline, and MTBE should be a unique tracer for gasoline vapor in areas where this compound is in gasoline. This suggests that MTBE will have a bimodal distribution in the NHANES data, which combines these two areas, as shown later. The geographic location of participants is not available for NHANES 1999–2000 (in contrast to earlier data) because the sample size is much smaller, estimates by geographic region are less stable, and the risk of identifying subjects is greater.

Table 3 Spearman rank correlation coefficients for the 10 VOCs using the weighted data, with statistically significant coefficients (p < 0.05) in bold

| VOCs | Toluene | Ethyl benzene | m,p-Xylene | o-Xylene | p-DCB | Chloroform | TCE | PERC | MTBE |
|---|---|---|---|---|---|---|---|---|---|
| Benzene | 0.59 | 0.61 | 0.67 | 0.60 | 0.10 | 0.14 | −0.10 | 0.04 | 0.22 |
| Toluene | | 0.70 | 0.72 | 0.72 | 0.13 | 0.13 | 0.02 | 0.12 | 0.11 |
| Ethyl benzene | | | 0.95 | 0.92 | 0.05 | 0.02 | 0.09 | 0.17 | 0.18 |
| m,p-Xylene | | | | 0.95 | 0.07 | 0.05 | 0.07 | 0.18 | 0.20 |
| o-Xylene | | | | | 0.07 | 0.07 | 0.09 | 0.21 | 0.19 |
| p-DCB | | | | | | 0.29 | 0.08 | 0.04 | 0.00 |
| Chloroform | | | | | | | 0.02 | 0.15 | 0.06 |
| TCE | | | | | | | | 0.41 | 0.21 |
| PERC | | | | | | | | | 0.24 |

Table 4 Identification and parameters of best-fit distributions

| VOCs | Best fits | Location | Scale | Shape | A–D | p-value |
|---|---|---|---|---|---|---|
| Benzene | Log-normal | 4.95 | 5.89 | – | 150.1 | <0.005 |
| Toluene | Log-normal | 29.40 | 39.76 | – | 76.7 | <0.005 |
| Ethyl benzene | Log-normal | 5.79 | 9.96 | – | 107.4 | <0.005 |
| m,p-Xylene | Log-normal | 15.89 | 30.95 | – | 86.1 | <0.005 |
| o-Xylene | Log-normal | 5.47 | 9.28 | – | 132.8 | <0.005 |
| BTEX | Log-normal | 65.52 | 97.65 | – | 95.5 | <0.005 |
| p-DCB (a) | Pareto | 0.31 | – | 0.43 | 393.2 | <0.005 |
| p-DCB (a) | Log-normal | 14.41 | 63.78 | – | 441.6 | <0.005 |
| Chloroform | Log-normal | 2.51 | 3.94 | – | 109.4 | <0.005 |
| TCE (a) | Pareto | 0.12 | – | 0.79 | 635.1 | <0.005 |
| TCE (a) | Log-normal | 0.90 | 1.65 | – | 1293.7 | <0.005 |
| PERC | Log-normal | 2.75 | 6.82 | – | 222.1 | <0.005 |
| MTBE (a) | Weibull | 0.38 | 1.38 | 0.39 | 716.7 | <0.010 |
| MTBE (a) | Log-normal | 3.75 | 9.40 | – | 1278.1 | <0.005 |
| TVOC10 | Log-normal | 108.34 | 155.62 | – | 92.0 | <0.005 |

(a) Log-normal distribution is not the best fit for p-DCB, TCE and MTBE, but estimated parameters for this distribution as well as the best-fit distributions are shown.
3.2. Correlations and factor analyses

As expected, the BTEX compounds were strongly correlated (r > 0.60, Table 3), and the correlations between ethyl benzene, m,p-xylene and o-xylene were especially high (0.92 ≤ r ≤ 0.95). The latter three compounds co-exist in gasoline, as well as in other products where they are called “mixed xylenes” (ATSDR, 2005). Chlorinated compounds TCE and PERC showed moderate correlation (r = 0.41). Correlations among other VOCs were weak. Weighted and unweighted (Table S3) correlation matrices were similar. The factor analysis identified three factors that explained 67% of the total variance when an eigenvalue cut-off of 1.0 was used, but the results unreasonably associated MTBE, the gasoline tracer, with the chlorinated solvents TCE and PERC. We then used a four-factor analysis with a lower eigenvalue cut-off (0.8), which resolved this issue. This analysis explained 76% of the variance. Factor 1 included the BTEX compounds, in which the mixed xylenes had very high loadings (>0.9), following from the correlations and showing that these VOCs nearly always occur together. Toluene and benzene had lower loadings (0.73 and 0.79, respectively), indicating that other factors contribute to these compounds. Factor 2 included TCE and PERC, which are mainly used in dry-cleaning products. Factor 3 contained p-DCB, a deodorant found especially in toilets, and chloroform, a water disinfection byproduct; thus this factor likely reflects exposures in bathrooms. Factor 4 contained only MTBE (loading of 0.83). These factors varied slightly depending on whether or not the data were log-transformed. The factor analyses helped to confirm the identification of the major VOC groups and the likely sources of exposure. For further analysis in the present paper, we selected one compound from each of the four factors, specifically, benzene, PERC, chloroform and MTBE (Supplemental materials show the other VOCs).
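For readers who want to reproduce the flavor of Table 3, a Spearman coefficient is the Pearson correlation of the ranks. The sketch below is deliberately simplified and unweighted: it ignores the survey weights and design-based variance estimates that the study obtained from SUDAAN and the SAS survey procedures, and the function names are hypothetical. Tied values receive average ranks, which matters here because non-detects set to one-half of the MDL produce many ties.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Unweighted Spearman rank correlation of two equal-length sequences."""
    return pearson(average_ranks(x), average_ranks(y))
```

Applied to two columns of concentrations, `spearman(ethyl_benzene, o_xylene)` would return the unweighted analogue of the coefficients reported for the mixed xylenes; the variable names in that call are placeholders.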
3.3. Probability and frequency distributions

3.3.1. Full distributions

Frequency distributions show “heavy” tails for the BTEX compounds, and high, narrow peaks at low concentrations with only a very few high observations for TCE, PERC and MTBE (Fig. S1). The latter three compounds were detected least frequently (i.e., many values were below MDLs), and their median concentrations were the lowest among the ten VOCs in NHANES. Bimodal distributions were observed for MTBE and chloroform. For MTBE, over 70% of measurements were below the MDL, which formed a mode around 0.7 µg m−3; the second but smaller mode occurred around 7 µg m−3. The lower mode of the bimodal distribution reflects MDLs obtained for those study participants living in areas where MTBE is not used, as well as those living in MTBE-use areas but who have very low exposure to gasoline vapors. The upper mode reflects MTBE-exposed participants living in MTBE-use areas. For chloroform, the lower mode at about 1 μg m−3 may reflect both background levels and perhaps an erroneously low MDL (stated as 0.4 μg m−3); the upper mode near 4 μg m−3 may reflect individuals having higher exposure to chloroform. Each NHANES measurement (all VOCs) is assigned a unique MDL, which depends on the averaging time.

Of the candidate distributions, log-normal distributions had the best fit to all VOCs except for p-DCB and TCE, which were assigned the Pareto distribution, and MTBE, which was assigned the Weibull distribution (Table 4). MTBE is a special case of a mixed distribution since, as just discussed, concentrations outside the MTBE-use area reflect MDLs, which in turn reflect the small amount of variation in the time that the badge was exposed. In the MTBE-use area, distributions would be expected to be roughly log-normal, paralleling benzene, which also arises from gasoline-related sources. However, as noted, the MTBE distribution cannot be cleanly split since information on the locations of participants is unavailable. These distributions were selected using all observations (n = 665) and the A–D test; the K–S and χ² tests gave similar results. However, goodness-of-fit tests usually rejected the candidate distributions, a typical result for environmental data, in part due to anomalies and measurement errors (Ott, 1995).

Fitted and measured cumulative frequency distributions are compared for the four VOCs in Figs. 1–4 (Fig. S2 shows similar plots for all ten VOCs, BTEX and TVOC10). Agreement was considered “good” if fitted quantiles were within ± 20% of the observations. Most concentrations below the 20th percentile were underestimated; however, these measurements usually fell below MDLs and risk-based values (Table 1). Otherwise, fits varied by VOC and percentile. The BTEX compounds and chloroform generally showed good agreement with log-normal distributions, although ethyl benzene and xylenes showed moderate differences, e.g., 65–80th percentiles were overestimated by 20–30%, and 99th percentile concentrations were underestimated by 13–60%. Other compounds showed poor agreement, e.g., 95th to 99th percentile concentrations were underestimated by 30–65% for chloroform, TCE and PERC; overestimated for MTBE by 7–35%; and hugely overestimated for p-DCB (factor of 2–28). Fits for TCE and MTBE were also poor at intermediate percentiles. Log-normal distributions for p-DCB, TCE and MTBE, e.g., Fig. 4B, demonstrated poor fits that were clearly worse than the selected Pareto and Weibull distributions. The composite variables, BTEX and TVOC10, closely fit log-normal distributions, probably because these summations of VOCs tended to “average-out” disparities.

While log-normal distributions provided moderately good fits to most compounds, both low and high concentrations were underestimated, and the middle range was overestimated. Geometric means were very close to medians for BTEX compounds, moderately higher for chlorinated compounds, and much higher for p-DCB and MTBE (Table 2). The highest concentrations (≥95th percentile) were significantly under-predicted. The geometric standard deviations σg ranged from 2.6 (benzene) to 5.7 (p-DCB), showing considerable variation and no clear groupings (Table 2). None of the candidate distributions fit p-DCB, TCE and MTBE, compounds with low detection frequencies (63, 23 and 28%, respectively). As elaborated in the Discussion, we speculate that these measurements reflect multiple circumstances: non-detects; moderate concentrations due to local but dispersed sources; and very high concentrations due to some unusual contact or exposure situation.

Fig. 1. A. Observed cumulative frequency distribution for measurements, and fitted (log-normal) cumulative probability distribution for benzene concentrations. B. Probability plots for maximum Gumbel type distribution fitting both the top 5 and 10% of measurements. Points show individual measurements; lines show fitted distribution based on linear regression after removing outliers, as discussed in text.

Fig. 2. Observed and fitted distributions for PERC. Otherwise as Fig. 1.
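The comparison of fitted and observed quantiles described for the full distributions can be sketched as follows. The sketch fits a log-normal distribution by weighted maximum likelihood, which for integer weights gives the same estimates as repeating each case according to its weight, and then applies the ±20% agreement criterion. It is an illustration under these assumptions, not the Crystal Ball procedure used in the study; the function names are hypothetical, and the parameters here are the mean and standard deviation of the log concentrations, which need not match the location and scale convention of Table 4.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def fit_lognormal(concs: Sequence[float],
                  weights: Sequence[float]) -> Tuple[float, float]:
    """Weighted MLE of (mu, sigma) for ln(concentration).

    With integer weights this equals the MLE on a dataset in which each
    case is repeated `weight` times.
    """
    total = sum(weights)
    logs = [math.log(c) for c in concs]
    mu = sum(w * x for x, w in zip(logs, weights)) / total
    var = sum(w * (x - mu) ** 2 for x, w in zip(logs, weights)) / total
    return mu, math.sqrt(var)


def lognormal_quantile(mu: float, sigma: float, p: float) -> float:
    """Concentration at cumulative probability p (0 < p < 1) of the fitted distribution."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


def within_20_percent(fitted: float, observed: float) -> bool:
    """Agreement criterion: fitted quantile within +/-20% of the observed one."""
    return abs(fitted - observed) <= 0.2 * observed
```

For example, `within_20_percent(lognormal_quantile(mu, sigma, 0.95), observed_p95)` checks one percentile, where `observed_p95` stands for the weighted 95th percentile of the data; repeating the check over a grid of percentiles shows where a fit under- or over-predicts.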
3.3.2. Extreme distributions

Fitted and observed maximum Gumbel distributions for concentrations exceeding the 90th and 95th percentile concentrations are shown for benzene, PERC, chloroform and MTBE in Figs. 1–4, respectively; fitting results, e.g., goodness-of-fit as R², are shown in Table 5 (plots for all ten VOCs are shown in Fig. S2). Considering the top decile and all of the data, R² values were not impressive, ranging from 0.38 (ethyl benzene) to 0.89 (chloroform). Most VOCs attained only poor-to-fair fits. Several of the highest measurements exceeded the next highest measurement by about 2-fold, suggesting statistical outliers. These are also apparent as large deviations between the measurements and the fitted Gumbel distributions. On this basis, we identified the following outliers:

• Benzene, 1 measurement (119 µg m−3, subject #5359, Fig. 1B). This individual reported using household disinfectants, degreasing cleaners or furniture polish.

• Toluene, 6 measurements (1611, 1551, 1399, 1267, 797 and 668 µg m−3 for subjects #4879, #8631, #2037, #4479, #2002, and #1002, respectively). All of these subjects reported at least one of the following activities: pumping gasoline into a car; being near a smoking person for >10 min; and breathing fumes or using gasoline. Note that at the outset, we deleted two cases with still higher toluene levels (1352 and 6280 µg m−3 for participants #3852 and #4076).

• Ethyl benzene, 1 measurement (837 µg m−3, subject #4514).

• m,p-Xylene, 1 measurement (729 µg m−3, subject #8801).

• o-Xylene, 3 measurements (202, 173 and 129 µg m−3 for subjects #8801, #4514 and #8110, respectively). Subjects #4514 and #8801 reported being near a smoker for >10 min. Subject #8110 reported pumping gasoline into a car.

• PERC, 2 measurements (659 and 490 µg m−3 for subjects #9751 and #130, respectively, Fig. 2B). These subjects did not report any contact with dry-cleaning products.

• p-DCB, 4 measurements (2236, 2227, 1511 and 1152 µg m−3 for subjects #3294, #8172, #7929 and #9158, respectively). Three of these subjects reported deodorizer use (not #3294).

• MTBE, 6 measurements (182, 170, 159, 155, 126 and 98 µg m−3 for subjects #6514, #2031, #1551, #4350, #1002 and #7949, respectively, Fig. 4B). Five of these subjects (not #7949) reported pumping gasoline into a car, or breathing fumes or using gasoline.

Fig. 3. Observed and fitted distributions for chloroform. No outliers are removed from the maximum Gumbel distribution. Otherwise as Fig. 1.

Fig. 4. Observed and fitted distributions for MTBE. “Fitted Distribution1” is the Weibull distribution, the best fit. “Fitted Distribution2” is log-normal, shown for comparison. Otherwise as Fig. 1.
Chloroform and TCE did not show obvious outliers. Interestingly, only four subjects had multiple outliers (#1551 for toluene and MTBE; #2002 for toluene and MTBE; #4514 for ethyl benzene and o-xylene; #8801 for m,p-xylene and o-xylene). In addition, subject #5359, who had a high benzene exposure, had the second highest chloroform concentration. Several of these concentrations are extremely high and indicate the presence of very strong and local sources, e.g., p-DCB concentrations of >1000 μg m−3 are likely due to the use of mothballs and deodorizers. If such exposures are infrequent, then the calculated lifetime exposures and risks may not be excessive. Unfortunately, the NHANES dataset does not allow an estimate of the frequency of such events. BTEX and TVOC10 did not show outliers other than the two cases (subjects #4076 and #3852) removed at the outset. After removing these measurements, the fit of the Gumbel distribution improved considerably, and most R² values exceeded 0.75, with the exceptions of toluene and TCE. Results for chloroform and TCE were unchanged since no data were removed (the rationale and approach to such selective data censoring are discussed in the Discussion). Still, the top decile may not represent true extrema, especially if many measurements fall below MDLs. Thus, we refit the Gumbel distribution to the top 5% of the data that exceeded MDLs. This improved the fit for all VOCs, especially for TCE, for which the R² jumped to 0.88. Removal of outliers further improved fits, giving R² ≥ 0.85 for all VOCs except MTBE. Removing the top 8 MTBE measurements (2 additional points, rather than just the 6 noted earlier) improved MTBE's R² to 0.87. Thus, with appropriate delineation of extrema and exclusion of outliers, extreme values can be closely modeled.
4. Discussion Measurements of environmental pollutants such as those in the NHANES VOC exposure database reflect multiple circumstances that
may be classified into four groups based on the capabilities of the monitoring method: (1) values falling below method detection limits (MDLs), which are frequently assigned an estimated or imputed value, e.g., 1/2 MDL; (2) detections or “traces” exceeding MDLs but still below quantitation limits (e.g., 10 σ), that can only be imprecisely determined; (3) values within the normal linear range of the instrument; and (4) “over-range” measurements that are likely to be under-reported due to saturation or other non-linear effects. Reported measurements are also prone to errors in the collection, analysis, data entry and other factors. Measurements also may be classified into four groups with respect to the phenomena that underlie the pollution or the pollutant “event” during the measurement period: (1) an absence of the pollutant; (2) generally low or “background levels” that arise due to contributions from distant or “regional” emission sources; (3) moderate-to-high concentrations from “local” or strong emission sources that are well-dispersed; and (4) occasional very high concentration “hits” yielding “extrema” due to “near-field” impacts, exceptionally strong sources, or a combination of moderately-to-strong sources and unfavorable dispersive conditions. For pollutants where MDLs are low, measurements often reflect contributions from both background and local sources. A conceptual understanding of these groupings and at least some quantification of the applicable concentration ranges, which do not have precise boundaries, are necessary to properly interpret measurements and distributions, including the identification of outliers. We note that few laboratories or investigators report performance measures that include limit of quantitation and linear dynamic range. Also, all VOC sampling techniques have limitations, and partial saturation of the adsorbents in the passive samplers used in NHANES will reduce their sampling uptake rate at
Table 5. Parameters of extreme distributions, including slopes, intercepts (IC), and R² from Eq. 1

| VOC | Top 10%, with outliers: Slope | IC | R² | Top 10%, without outliers: Slope | IC | R² | Top 5%, with outliers: Slope | IC | R² | Top 5%, without outliers: Slope | IC | R² | No. of outliers removed |
| Benzene | 8 | 18 | 0.79 | 7 | 18 | 0.83 | 12 | 9 | 0.85 | 11 | 11 | 0.86 | 1 |
| Toluene | 177 | 107 | 0.61 | 67 | 105 | 0.69 | 389 | −295 | 0.87 | 180 | −45 | 0.92 | 6 |
| Ethyl benzene | 61 | 25 | 0.38 | 25 | 27 | 0.75 | 157 | −147 | 0.59 | 53 | −14 | 0.94 | 1 |
| m,p-Xylene | 68 | 79 | 0.85 | 66 | 80 | 0.85 | 104 | 22 | 0.95 | 100 | 27 | 0.95 | 1 |
| o-Xylene | 22 | 26 | 0.78 | 15 | 26 | 0.84 | 38 | −3 | 0.91 | 27 | 11 | 0.94 | 3 |
| BTEX | 237 | 277 | 0.78 | 237 | 277 | 0.78 | 446 | −124 | 0.95 | 446 | −124 | 0.95 | 0 |
| p-DCB | 210 | 114 | 0.70 | 143 | 122 | 0.89 | 395 | −260 | 0.79 | 219 | 2 | 0.93 | 4 |
| Chloroform | 6 | 10 | 0.89 | 6 | 10 | 0.89 | 8 | 6 | 0.94 | 8 | 6 | 0.94 | 0 |
| TCE | 41 | 9 | 0.62 | 41 | 9 | 0.62 | 119 | −170 | 0.88 | 119 | −170 | 0.88 | 0 |
| PERC | 48 | 15 | 0.45 | 19 | 17 | 0.81 | 130 | −149 | 0.70 | 39 | −13 | 0.96 | 2 |
| MTBE | 25 | 21 | 0.65 | 11 | 20 | 0.82 | 56 | −42 | 0.70 | 15 | 20 | 0.72 | 6 |
| TVOC10 | 283 | 414 | 0.82 | 283 | 414 | 0.82 | 441 | 143 | 0.96 | 441 | 143 | 0.96 | 0 |

Results shown for data with and without outliers. ‘Top 10th percentile’ is the top 10% of all data, and ‘Top 5th percentile’ is the top 5% of the observations above MDLs. Number of outliers shown at right.
C. Jia et al. / Environment International 34 (2008) 922–931
long averaging times and at high concentrations, leading to negative biases at high concentrations or sampling times (Jia et al., 2007).

4.1. Probability distributions

In general, pollutant concentrations and exposures are random in nature as they depend upon a number of variable factors, e.g., emission rates, microenvironmental characteristics, time activity budgets, and human activities. Often, basic information regarding VOC concentrations or exposures is neither available, generalizable, nor certain. This is in strong contrast to distributions of other variables used in exposure and risk calculations, e.g., dosimetric parameters (e.g., body weight, intake rate) and time activity durations (Sexton et al., 1992; Finley et al., 1994), which are well-characterized and easily bounded. Notably, the variation in concentrations or exposures can dwarf the variation in other parameters (with the possible exception of toxicity parameters like cancer-slope factors). Further variation in results of measurement programs can be caused by a host of factors, including sampling and analysis methods, sampling time, study population, season, weather, etc.

4.1.1. Full distributions

Early studies of probability distributions focused on ambient measurements of criteria pollutants in cities, e.g., carbon monoxide (Ott, 1979) and sulfur dioxide (Berger et al., 1982), and concentrations at all averaging times were usually found to approximate log-normal distributions (Larsen, 1969; Ott, 1995). In workplace settings, 8-hour time-weighted average concentrations also have been frequently represented using log-normal distributions (Nicas and Jayjock, 2002). Relatively few studies have examined distributions of VOC concentrations in non-occupational settings. Log-normal distributions were assigned to ten VOCs measured in 427 indoor air samples collected in residences in Denver, Colorado (Foster et al., 2003).
Gamma distributions provided the best fit to concentrations of 28 VOCs (including 11 aldehydes) measured in 1417 Japanese homes (Park and Ikeda, 2004). A recent U.S. review reported the log-normal distribution as the best fit for 9 VOCs in most microenvironments, and the Gamma distribution for chloroform in dining rooms (Loh et al., 2007). As in the ambient, workplace and indoor studies, log-normal distributions provided only an approximate fit, at best, for most of the ten VOCs examined in the present study. The fit was not always very good, especially for the less frequently detected compounds, and statistical tests of agreement usually failed.

Fitting and analyzing distributions of air pollutant concentrations has not become routine practice in exposure assessment. To determine the underlying distribution, measurements are generally matched to theoretical distributions using three steps: selection of a candidate distribution; estimation of its parameters; and assessment of the goodness-of-fit (US EPA, 1997). Despite the availability of automated software that can rapidly perform such analyses (e.g., Crystal Ball, Decisioneering, Inc., Denver, CO, USA; @Risk, Palisade Corporation, Ithaca, NY, USA; Risk Solver, Frontline Systems, Inc., Incline Village, NV, USA), it appears that the most common approach continues to be the assumption of log-normality. Thus, medians or geometric means are used as a measure of central tendency, and data are log-transformed for statistical inference testing. These statistics give little if any information regarding extrema, and the log-normal distributions rarely meet goodness-of-fit criteria.

4.1.2. Extreme distributions

As we noted at the outset, few studies have used a sampling design or attained a sample size that is sufficient to characterize population exposure. Importantly, extrema can only be derived from large studies.
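The three fitting steps listed in Section 4.1.1 (candidate selection, parameter estimation, goodness-of-fit assessment) can be sketched for the log-normal case as follows. This is only an illustrative sketch under stated assumptions: the data are synthetic rather than NHANES measurements, the function names are hypothetical, and a Kolmogorov-Smirnov distance is used as a simple stand-in for the goodness-of-fit tests applied in the study.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def fit_lognormal(values):
    """Step 2: estimate log-normal parameters from the logs of positive data."""
    logs = [math.log(v) for v in values]
    return mean(logs), stdev(logs)  # mu and sigma of ln(concentration)


def ks_statistic(values, mu, sigma):
    """Step 3: Kolmogorov-Smirnov distance between data and fitted model."""
    model = NormalDist(mu, sigma)
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = model.cdf(math.log(x))
        # Largest gap between the empirical and fitted CDFs at this point.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Step 1: the candidate distribution is log-normal (an assumption).
# Hypothetical data: 500 synthetic "concentrations", not NHANES values.
random.seed(1)
sample = [random.lognormvariate(1.0, 1.2) for _ in range(500)]

mu, sigma = fit_lognormal(sample)
gm, gsd = math.exp(mu), math.exp(sigma)  # geometric mean and geometric SD
d_stat = ks_statistic(sample, mu, sigma)
print(f"GM={gm:.2f}  GSD={gsd:.2f}  KS distance={d_stat:.3f}")
```

A small distance indicates that the fitted curve tracks the bulk of the data; as the text notes, such a fit says little about the extrema.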
Extreme values generally do not follow the distribution derived from the full range of the data. In many cases, a particular distribution or several distributions may reasonably approximate the middle 80%
of the values; however, they may be inappropriate for the top 5 or 10% of the data (Haas, 1997). Extreme concentrations of air pollutants were found to follow the Gumbel distribution when the full range was log-normally distributed (Singpurwalla, 1972). We recently found that Gumbel distributions were appropriate for the top decile concentrations of 23 VOCs and carbonyls measured in Michigan, U.S. (Le et al., 2007). The present study confirms that Gumbel distributions can be used to describe the extreme values (e.g., top 10th or 5th percentiles) of personal VOC exposures, with the caveat that a small number of outliers will still exceed the fitted distribution. As noted by Ott (1995), the upper tail of a distribution reflects a stochastic process and is insensitive to the type of hypothetical distribution, regardless of the original distribution producing the tail. Thus, a variety of distributions can fit the extreme values equally well. While the NHANES VOC extrema were well fit by the Gumbel distribution, we found that Gamma and Weibull distributions were selected for the top 10th percentile data, and Gamma and Beta distributions for the top 5th percentile data on the basis of A–D tests (data not shown). One advantage of the Gumbel distribution, however, is that its linear plot helps in the identification of outliers.

Our experience analyzing the NHANES data provides guidance in fitting extrema. First, a large sample is required, and it is advantageous if most measurements exceed MDLs. Possibly 5% or fewer of the observed values above MDLs may be considered extrema. Second, distribution fitting cannot depend solely on goodness-of-fit tests; it also requires subjective judgment. Third, while the Gumbel (and other) distributions are extreme value distributions, they may not fit outliers; thus, these points must still be identified and removed, and an iterative approach may be the best option.
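The linear Gumbel plot mentioned above can be illustrated with a short sketch. The assumptions are considerable: Eq. 1 is not reproduced in this section, so the regression below uses the standard Gumbel reduced variate y = −ln(−ln p) with an assumed plotting position p = (i − 0.5)/k; the data are synthetic; and `gumbel_plot_fit` is a hypothetical helper, not the authors' procedure. The slope and intercept correspond to the scale and location of a maximum Gumbel distribution.

```python
import math
import random


def gumbel_plot_fit(values, top_fraction=0.10):
    """Regress the largest values on the Gumbel reduced variate.

    Returns (slope, intercept, r_squared, tail) for the top `top_fraction`
    of the supplied values.
    """
    xs = sorted(values)
    k = max(3, round(len(xs) * top_fraction))
    tail = xs[-k:]  # the k largest values, ascending
    # Assumed plotting position p_i = (i - 0.5) / k for i = 1..k.
    ys = [-math.log(-math.log((i - 0.5) / k)) for i in range(1, k + 1)]
    y_bar = sum(ys) / k
    x_bar = sum(tail) / k
    sxy = sum((y - y_bar) * (x - x_bar) for y, x in zip(ys, tail))
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxx = sum((x - x_bar) ** 2 for x in tail)
    slope = sxy / syy                    # least-squares slope of x on y
    intercept = x_bar - slope * y_bar
    r_squared = sxy * sxy / (sxx * syy)  # squared correlation coefficient
    return slope, intercept, r_squared, tail


# Hypothetical data: synthetic concentrations plus one artificial outlier.
random.seed(7)
data = [random.lognormvariate(1.0, 1.0) for _ in range(1000)]
with_outlier = data + [max(data) * 20]

slope_a, ic_a, r2_a, _ = gumbel_plot_fit(with_outlier)
slope_b, ic_b, r2_b, _ = gumbel_plot_fit(data)
print(f"with outlier:    slope={slope_a:.1f} IC={ic_a:.1f} R2={r2_a:.2f}")
print(f"without outlier: slope={slope_b:.1f} IC={ic_b:.1f} R2={r2_b:.2f}")
```

In this toy case a single extreme point pulls the fitted line away from the remaining tail and lowers R², which is the behavior that makes the linear plot useful for flagging candidate outliers before refitting.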
Such data censoring also may be necessary to improve model fit for both full and extreme value distributions. Such actions often and justifiably are criticized as “cherry picking”. We recognize the uncertainty of the data, and believe that most of the deleted values represent unusual cases. However, relatively common situations such as refueling a vehicle, smoking, and wearing freshly dry-cleaned clothes need more investigation to see if they can produce the very high measurements encountered. Still, the 24 censored measurements, plus the 2 censored cases representing 20 additional measurements, represent a very small percentage (0.7%) of the 6600 VOC measurements in NHANES. For most of these measurements, our initial examination of the NHANES survey data did not show anything unusual, though this investigation is ongoing. Unfortunately, in a study design like NHANES, follow-up interviews or repeated measurements to try to understand the exposure source and the reliability of the measurement are not possible.

4.2. Comparison of NHANES and other exposure estimates

For some years, it has been known that exposure estimates derived using personal sampling often exceed exposures based on indoor monitoring, which in turn exceed measurements using outdoor or ambient monitoring. This can apply to VOCs (Sexton et al., 2004; Edwards et al., 2005), as well as other pollutants, e.g., particulate matter (Wallace, 2000). While this “personal pollution cloud” or “Linus effect” (after the comic strip character) is becoming better recognized, its strength and variability among individuals have not been quantified. Due to its significance, NHANES measurements should only be compared to other studies that use personal sampling.
For VOCs, these include the Total Exposure Assessment Method (TEAM) studies in the 1980s (Wallace, 2001), the National Human Exposure Assessment Survey (NHEXAS) in the late 1990s, and more recently, the Relationships of Indoor, Outdoor and Personal Air (RIOPA) study. However, these (and other mostly smaller) studies are not necessarily representative of the U.S. population, and none used a population-based sampling strategy. Thus, these comparisons may reflect local or regional differences in VOC exposure.
We selected three U.S. studies that measured personal VOC exposures and were more or less contemporaneous with NHANES. These were conducted in Minnesota (MN) by Sexton et al. (2004), in Maryland (MD) by Payne-Sturges et al. (2004), and in New Jersey, Texas and California (NJ/TX/CA; Weisel et al., 2005b). We also included the slightly earlier (mid-1990s) NHEXAS study (Clayton et al., 1999). Table 6 compares average, median and 90th percentile (95th percentile for NJ/TX/CA) concentrations reported in these studies. Measurements from all studies show the very strong effect of non-normality, e.g., means are typically 2 to 3 times higher than medians (the NJ/TX/CA study shows a 30-fold difference for p-DCB). Largely due to the influence of high concentrations (including potential outliers), and to an extent due to the limited sample sizes (especially in the MD study), it is clear that averages do not provide robust measures of central tendency. Thus, the following discussion emphasizes non-parametric statistics.

Of the four reported VOCs, median concentrations in NHEXAS significantly exceeded those in the more recent studies. This is unsurprising given the general downward trend in indoor and outdoor VOC concentrations (Hodgson and Levin, 2003). In the three other studies, medians and upper percentile statistics were similar to NHANES. Only three compounds showed sizable differences:

• p-DCB: In MN, levels were very low (examining 50th and 90th percentiles), about 4 to 6 times lower than the NHANES data. In NJ/TX/CA, medians were comparable to NHANES, but the 95th percentile concentration was extremely high (314 µg m−3), twice that in NHANES (95th percentile concentration is 142 µg m−3, Table 2).
• TCE: NHANES data showed a median TCE level 1.7–2.6 times higher than those in the three other studies.
• MTBE: In MD, the median MTBE level was nearly 5 times higher than the NHANES results, while the 90th percentile concentration was 6 times higher. The NJ/TX/CA statistics were 3 to 4 times higher. In these cases, the studies emphasized highly traffic-exposed individuals; moreover, MTBE may have been more widely used in these study areas than in NHANES, which included areas where it was not used. After censoring non-detected MTBE measurements, the NHANES data gave a value of 6.2 µg m−3, just slightly lower than levels in the NJ/TX/CA and MD studies. Note that this comparison is meaningful only if it is assumed that all or most measurements in MTBE usage areas would result in detections, which did occur for the other gasoline components (MTBE was not reported in the MN study).

This comparison reveals several important findings. First, larger though localized studies can give statistics that are representative or
nearly so, judged on the basis of their similarity to the NHANES data, which is population-based and thus should be representative. This mainly applies to the BTEX compounds that are ubiquitous. Second, there is a need for additional and probably improved measurements of chlorinated compounds, especially since some or much of the interstudy variation seems likely to arise from MDL effects (lower MDLs are needed). Finally, as noted earlier, when a pollutant like MTBE is used in only a subset of the region studied, the resulting statistics and derived distributions may not be reliable or nationally representative.

4.3. Importance and applications

The analysis of the NHANES data suggests that representing the full range of VOC exposures requires a combined approach, namely, a log-normal (or other) distribution may be used for low to moderately high concentrations, and an extreme value distribution for the very highest (≥95th percentile) concentrations. It is the highest concentrations and exposures that may need control or mitigation, or drive policies to this effect; thus, these values require further attention. Also, the shift from deterministic to probabilistic analyses, such as Monte Carlo methods, requires appropriate distributions of exposure parameters (US EPA, 1995), and fitting and assigning probability distributions is a first and critical step (Haas, 1997; Hamed and Bedient, 1997; Thompson, 1999b). Log-normal distributions are not always the first choice, and several VOCs appear to follow other distributions. All of the full distributions, that is, those that attempt to match all of the data, are likely to lead to the wrong conclusions concerning the level and frequency of extrema.

4.4. Study limitations

We could not stratify the data to isolate regions where MTBE is used in gasoline, and thus a single distribution very poorly described MTBE concentrations.
There are no replicates in the NHANES dataset, no uncertainty estimates for individual data points, and no opportunities to further investigate outliers. Exposure assumptions were simplified, i.e., short-term NHANES measurements were extrapolated to estimate lifetime exposures without adjustment for trends and uncertainties. We also note that the risk levels and reference concentrations used are protective guidelines, not standards. As concentrations of many VOCs are decreasing, the fitted distributions and other statistics in the present paper will likely need updates in future years. Our identification of the factors that explain the variation in the dataset is tentative, and might change with additional information. Finally, it should be recognized that due to correlations among VOCs, univariate analyses cannot be
Table 6. Results from selected studies of personal exposure to VOCs in the U.S. since 1990, and comparison to NHANES

| Study | Study area | Period | Sample size |
| NHANES | U.S. | 1999–2000 | 665 |
| RIOPA | Elizabeth, NJ; Houston, TX; Los Angeles, CA | 1999–2001 | 545 |
| NHEXAS | IL, IN, OH, MI, MN, WI | 1995–1997 | 386 |
| Minneapolis study | Minneapolis, MN | 1999 | 288 |
| South Baltimore study | South Baltimore, MD | 2000–2001 | 37 |

| VOC | NHANES Mean | NHANES Median | NHANES Q90 | RIOPA Mean | RIOPA Median | RIOPA Q95 | NHEXAS Mean | NHEXAS Median | NHEXAS Q90 | Minneapolis Mean | Minneapolis Median | Minneapolis Q90 | South Baltimore Mean | South Baltimore Median | South Baltimore Q90 |
| Benzene | 5.3 | 2.8 | 13.5 | 3.6 | 2.4 | 10.7 | 7.5 | 5.4 | 13.7 | 7.6 | 3.2 | 18.3 | 4.1 | 2.9 | 7.3 |
| Toluene | 36.4 | 17.4 | 59.8 | 19.2 | 12.2 | 50.2 | NA | NA | NA | 30.3 | 17.1 | 62.9 | 26.8 | 14.7 | 41.3 |
| Ethyl benzene | 8.4 | 2.6 | 14.2 | 2.8 | 1.7 | 7.5 | NA | NA | NA | 5.6 | 2.2 | 11.8 | 4.4 | 2.5 | 9.5 |
| m,p-Xylene | 18.8 | 6.5 | 38.7 | 8.1 | 4.4 | 22.7 | NA | NA | NA | 21.0 | 7.4 | 48.6 | 17.8 a | 9.5 a | 30.9 a |
| o-Xylene | 6.5 | 2.4 | 14.1 | 2.9 | 1.7 | 8.1 | NA | NA | NA | 6.8 | 2.3 | 15.6 | NA | NA | NA |
| p-DCB | 27.3 | 1.7 | 34.8 | 56.7 | 1.9 | 314.0 | NA | NA | NA | 3.2 | 0.4 | 5.1 | NA | NA | NA |
| Chloroform | 2.7 | 1.1 | 5.9 | 4.2 | 1.0 | 6.3 | 2.3 | 2.0 | 4.5 | 1.5 | 1.0 | 3.9 | 4.8 | 2.3 | 7.8 |
| TCE | 3.4 | 0.3 | 1.2 | 1.0 | 0.1 | 1.9 | 5.3 | 0.6 | 6.0 | 1.0 | 0.2 | 1.4 | 0.4 | 0.2 | 0.8 |
| PERC | 5.2 | 0.7 | 6.6 | 7.1 | 0.6 | 7.2 | 31.9 | 2.0 | 10.8 | 31.8 | 0.9 | 7.0 | 3.0 | 0.9 | 8.2 |
| MTBE | 5.2 | 0.6 | 10.7 | 14.8 | 7.1 | 42.7 | NA | NA | NA | NA | NA | NA | 24.7 | 8.8 | 66.6 |

RIOPA is cited in Weisel et al., 2005b, NHEXAS in Clayton et al., 1999, Minneapolis study in Sexton et al., 2004, and South Baltimore in Payne-Sturges et al., 2004. “Q90” and “Q95” are 90th and 95th percentile concentrations, respectively. “NA” is not available in the indicated study.
a Includes m-, p- and o-xylenes.
used to represent VOC mixtures, which represent a challenging public health issue (US EPA, 2000; ATSDR, 2000).

5. Conclusions

This study explored the distribution of personal exposure measurements of VOCs, and its findings are relevant to health risk assessment and risk management. It is the first study to characterize VOC exposures at the national level using a population-based sampling strategy; thus, results should be broadly representative of non-occupational VOC exposures throughout the U.S. Eight of the ten VOCs monitored using personal sampling of 669 individuals in the NHANES dataset were detected in most samples. Exposures among study participants showed tremendous variability, ranging from below method detection limits to as high as 6280 μg m−3 for individual compounds and 14,287 μg m−3 as the sum of the ten VOCs in the NHANES dataset. Correlations and factor analysis identified four groups of possible emission sources: gasoline vapors and exhaust; tap water disinfection products; cleaning products; and gasoline additive (MTBE). Log-normal distributions were assigned to benzene, toluene, ethyl benzene, xylenes, chloroform and PERC with moderate-to-good agreement to observations. Different distributions were assigned to p-DCB and TCE (Pareto distributions) and MTBE (Weibull distribution), all with considerably poorer fit. Extrema were fit to the maximum Gumbel distribution, and reasonable agreement was found for most compounds, especially after censoring outliers and defining extrema as the top 5% of measurements above MDLs. The dataset contained a small fraction (<1%) of extremely high concentrations, considered to be outliers as they fit neither the full nor the extreme value distributions.
The NHANES exposure database suggests that log-normal distributions are not always the first choice for distributions, and that none of the standard distributional forms provided a close match to the levels and frequencies of the highest exposure concentrations that pose the greatest risks.

Acknowledgement

This work was performed under the support of the Mickey Leland National Urban Air Toxics Research Center, Grant RFA 2006-01, entitled “The relationship between personal exposures to VOCs and behavioral, socioeconomic, demographic characteristics: analysis of the NHANES VOC project dataset.”

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.envint.2008.02.002.

References

ATSDR. Toxicological Profile for Xylene. Atlanta, GA: US Agency for Toxic Substances and Disease Registry; 2005. ATSDR. Guidance manual for the Assessment of Joint Toxic Action of Chemical Mixtures. Atlanta, GA: US Agency for Toxic Substances and Disease Registry; 2000. Barnett V. Probability plotting methods and order statistics. Appl Stat 1975;24:95–108. Batterman S, Jia C, Hatzivailis G. Migration of volatile organic compounds from attached garages to residences: a major exposure source. Environ Res 2007;104:224–40. Berger A, Melice JL, Demuth CL. Statistical distributions of daily and high atmospheric SO2 concentrations. Atmos Environ 1982;16:2863–77. CDC (Centers for Disease Control and Prevention). National Health and Nutrition Examination Survey 1999–2000 Public Data Release File Documentation. Hyattsville, MD: U.S. Department of Health and Human Services; 2006a. http://www.cdc.gov/nchs/data/nhanes/gendoc.pdf. Accessed 20 May 2007. CDC (Centers for Disease Control and Prevention). National Health and Nutrition Examination Survey 1999–2000 Data Documentation, Lab 21-Volatile Organic Compounds. Hyattsville, MD: U.S. Department of Health and Human Services; 2006b.
http://www.cdc.gov/nchs/data/nhanes/frequency/lab21_doc.pdf. Accessed 20 May 2007. CDC (Centers for Disease Control and Prevention). National Health and Nutrition Examination Survey Questionnaire (or Examination Protocol, or Laboratory Protocol). Hyattsville, MD: U.S. Department of Health and Human Services;
2006c. http://www.cdc.gov/nchs/about/major/nhanes/frequency/lab21_doc.pdf. Accessed 20 May 2007. Chung CW, Morandi MT, Stock TH, Afshar M. Evaluation of a passive sampler for volatile organic compounds at ppb concentrations, varying temperatures, and humidities with 24-h exposures. 1. Description and characterization of exposure chamber system. Environ Sci Technol 1999a;33:3661–5. Chung CW, Morandi MT, Stock TH, Afshar M. Evaluation of a passive sampler for volatile organic compounds at ppb concentrations, varying temperatures, and humidities with 24-h exposures. 2. Sampler performance. Environ Sci Technol 1999b;33: 3666–71. Clayton CA, Pellizzari ED, Whitmore RW, Perritt RL, Quackenboss JJ. National Human Exposure Assessment Survey (NHEXAS): distributions and associations of lead, arsenic and volatile organic compounds in EPA Region 5. J Expo Anal Environ Epidemiol 1999;9:381–92. Edwards RD, Schweizer C, Jantunen M, Lai HK, Bayer-Oglesby L, Katsouyanni K, et al. Personal exposures to VOC in the upper end of the distribution — relationships to indoor, outdoor and workplace concentrations. Atmos Environ 2005;39:2299–307. Finley BL, Paustenbach DJ. The benefits of probabilistic exposure assessment: three case studies involving contaminated air, water, and soil. Risk Anal 1994;14:53–73. Finley B, Proctor D, Scott P, Harrington N, Paustenbach D, Price P. Recommended distributions for exposure factors frequently used in health risk assessment. Risk Anal 1994;14:533–51. Foster SJ, Kurtz JP, Woodland AK. Background indoor air risks at selected residences in Denver Colorado; 2003. http://www.envirogroup.com/publications.php. Accessed 12 June 2007. Haas CN. Importance of distributional form in characterizing inputs to Monte Carlo risk assessments. Risk Anal 1997;17:107–13. Hamed MM, Bedient PB. On the effect of probability distributions of input variables in public health risk assessment. Risk Anal 1997;17:97–105. Hodgson AT, Levin H. 
Volatile organic compounds in indoor air: a review of concentrations measured in North America since 1990. Berkeley, CA: Lawrence Berkeley National Laboratory; 2003. Report LBNL-51715. Jia C, Batterman S, Godwin C. Continuous, intermittent and passive sampling of airborne VOCs. J Environ Monit 2007;9:1220–30. Larsen RI. A new mathematical model of air pollutant concentration averaging time and frequency. J Air Pollut Control Assoc 1969;19:24–30. Le HQ, Batterman SA, Wahl RL. Reproducibility and imputation of air toxics data. J Environ Monit 2007;9:1358–72. Lide DR, editor. CRC Handbook of Chemistry and Physics. Boca Raton, FL: CRC Press; 2005. Loh MM, Levy JI, Spengler JD, Houseman EA, Bennett DH. Ranking cancer risks of organic hazardous air pollutants in the United States. Environ Health Perspect 2007;115:1160–8. Mohamed M, Kang D, Aneja V. Volatile organic compounds in some urban locations in United States. Chemosphere 2002;47:863–82. NRC (National Research Council). Human Exposure Assessment of Airborne Pollutants: Advances and Opportunities. Washington, DC: National Academy of Sciences; 1991. Nicas M, Jayjock M. Uncertainty in exposure estimates made by modeling versus monitoring. AIHAJ 2002;63:275–83. Nieuwenhuijsen M, Paustenbach D, Duarte-Davidson R. New developments in exposure assessment: the impact on the practice of health risk assessment and epidemiological studies. Environ Int 2006;32:996–1009. Ott WR. Environmental Statistics and Data Analysis. Boca Raton, FL: CRC Press, Inc.; 1995. Ott WR. Testing the Validity of the Lognormal Probability Model: Computer Analysis of Carbon Monoxide Data from U.S. Cities. Washington, DC: US Environmental Protection Agency; 1979. EPA-600/4-79-040. Park JS, Ikeda K. Exposure to the mixtures of organic compounds in homes in Japan. Indoor Air 2004;14:413–20. Payne-Sturges DC, Burke TA, Breysse P, Diener-West M, Buckley TJ. 
Personal exposure meets risk assessment: a comparison of measured and modeled exposures and risks in an urban community. Environ Health Perspect 2004;112:589–98. Raw GJ, Coward SKD, Brown VM, Crump DR. Exposure to air pollutants in English homes. J Expo Anal Environ Epidemiol 2004;14:S85–94. Roberts EM. Review of statistics of extreme values with applications to air quality data. Part I. Review. J Air Pollut Control Assoc 1979a;29:632–7. Roberts EM. Review of statistics of extreme values with applications to air quality data. Part II. Application. J Air Pollut Control Assoc 1979b;29:733–40. Saarela K, Tirkkonen T, Laine-Ylijoki J, Jurvelin J, Nieuwenhuijsen M, Jantunen M. Exposure of population and microenvironmental distributions of volatile organic compound concentrations in the EXPOLIS study. Atmos Environ 2003;37:5563–75. Sack TM, Steele DH, Hammerstrom K, Remmers J. A survey of household products for volatile organic-compounds. Atmos Environ Part A — Gen Topics 1992;26:1063–70. Sexton K, Selevan S, Wagener D, Lybarger J. Estimating human exposures to environmental pollutants: availability and utility of existing databases. Arch Environ Health 1992;47:398–407. Sexton K, Adgate JL, Ramachandran G, Pratt GC, Mongin SJ, Stock TH, et al. Comparison of personal, indoor, and outdoor exposures to hazardous air pollutants in three urban communities. Environ Sci Technol 2004;38:423–30. Sielken RL, Valdez-Flores C. Probabilistic risk assessment's use of trees and distributions to reflect uncertainty and variability and to overcome the limitations of default assumptions. Environ Int 1999;25:755–72. Singpurwalla ND. Extreme values from a lognormal law with applications to air pollution problems. Technometrics 1972;14:703–11. Thompson KM. Software review of distribution fitting programs: Crystal Ball and BestFit Add-In to @RISK. Hum Ecol Risk Assess 1999a;5:501–8.
Thompson KM. Developing univariate distributions from data for risk analysis. Hum Ecol Risk Assess 1999b;5:755–83. US EPA (US Environmental Protection Agency). Guidelines for Exposure Assessment (FRL-4129-5), vol. 104. Washington, DC: Federal Register; 1992. p. 22888–938. US EPA (US Environmental Protection Agency). Guidance for risk characterization. Washington, DC: Science Policy Council; 1995. US EPA (US Environmental Protection Agency). Guiding Principles for Monte Carlo Analysis. Washington, DC: Risk Assessment Forum; 1997. EPA/630/R-97/001. US EPA (US Environmental Protection Agency). Supplementary Guidance for Conducting Health Risk Assessment of Chemical Mixtures. Washington, DC: Risk Assessment Forum; 2000. EPA/630/R-00/002. US EPA (US Environmental Protection Agency). IRIS Database for Risk Assessment; 2007. http://www.epa.gov/iris/. Accessed 4 June 2007. Wallace LA. Correlations of personal exposure to particles with outdoor air measurements: a review of recent studies. Aerosol Sci Tech 2000;32:15–25.
Wallace LA. Human exposure to volatile organic pollutants: implications for indoor air studies. Annu Rev Energy Environ 2001;26:269–301. Wallace LA, Pellizzari E, Leaderer B, Zelon H, Sheldon L. Emissions of volatile organic compounds from building-materials and consumer products. Atmos Environ 1987;21:385–93. Weisel CP, Kim H, Haltmeier P, Klotz JB. Human respiratory uptake of chloroform and haloketones during showering. J Expo Anal Environ Epidemiol 1999;15:6–16. Weisel CP, Zhang J, Turpin BJ, Morandi MT, Colome S, Stock TH, et al. The relationships of indoor, outdoor and personal air (RIOPA) study: study design, methods and initial results. J Expo Anal Environ Epidemiol 2005a;15:123–37. Weisel CP, Zhang J, Turpin BJ, Morandi MT, Colome S, Stock TH, et al. Relationships of Indoor, Outdoor, and Personal Air (RIOPA): Part I. Collection Methods and Descriptive Analyses. Houston, TX: Health Effects Institute, Boston, MA and National Urban Air Toxics Research Center; 2005b. http://pubs.healtheffects.org/view.php?id=31. Accessed 18 October 2007.
Journal of Exposure Science and Environmental Epidemiology (2009) 19, 248–259 © 2009 Nature Publishing Group All rights reserved 1559-0631/09/$32.00
www.nature.com/jes
Predictors of personal air concentrations of chloroform among US adults in NHANES 1999–2000

ANNE M. RIEDERER a, SCOTT M. BARTELL a,b AND P. BARRY RYAN a

a Department of Environmental and Occupational Health, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA
b Program in Public Health, University of California, Irvine, California, USA
Volunteer studies suggest that showering/bathing with chlorinated tap water contributes to daily chloroform inhalation exposure for the majority of US adults. We used data from the 1999–2000 US National Health and Nutrition Examination Survey (NHANES) and weighted multiple linear regression to test the hypothesis that personal exposure microevents such as showering or spending time at a swimming pool would be significantly associated with chloroform levels in 2–3 day personal air samples. The NHANES data show that eight of 10 US adults are exposed to detectable levels of chloroform. Median (1.13 µg/m3), upper percentile (95th, 12.05 µg/m3), and cancer risk estimates were similar to those from recent US regional studies. Significant predictors of log personal air chloroform in our model (R² = 0.34) included age, chloroform concentrations in home tap water, having no windows open at home during the sampling period, visiting a swimming pool during the sampling period, living in a mobile home/trailer or apartment versus living in a single family (detached) home, and being Non-Hispanic Black versus Non-Hispanic White, although the race/ethnicity estimates appear influenced by several outlying observations. Reported showering activity was not a significant predictor of personal air chloroform, possibly due to the wording of the NHANES shower question. The NHANES measurements likely underestimate true inhalation exposures since subjects did not wear sampling badges while showering or swimming, and because of potential undersampling by the passive monitors. Research is needed to quantify the potential difference.

Journal of Exposure Science and Environmental Epidemiology (2009) 19, 248–259; doi:10.1038/jes.2008.7; published online 12 March 2008
Keywords: chloroform, personal air, inhalation, risk.
Introduction

Chloroform is a colorless, volatile liquid that is sparingly soluble in water and moderately lipophilic (Lide, 1996). Natural sources including sea water and soil processes account for 90% of emissions (Keene et al., 1999) while anthropogenic sources include releases from drinking water and wastewater treatment, certain industrial processes, cooling towers, and swimming pools (McCulloch, 2003). Most chloroform in the environment partitions to air, with the global average atmospheric concentration estimated to be 73 ng/m3 (McCulloch, 2003).

The present work was performed at the Department of Environmental and Occupational Health, Rollins School of Public Health, Emory University.

1. Abbreviations: AER, air exchange rate; CalEPA, California Environmental Protection Agency; CDC, US Centers for Disease Control and Prevention; CI, confidence interval; DHHS, US Department of Health and Human Services; NHANES, US National Health and Nutrition Examination Survey; RfD, reference dose; TEAM, Total Exposure Assessment Methodology; EPA, US Environmental Protection Agency
2. Address all correspondence to: Dr. Anne M. Riederer, Department of Environmental and Occupational Health, Rollins School of Public Health, Emory University, 1518 Clifton Road NE, Atlanta, GA 30322, USA. Tel.: +404 712 8458. Fax: +404 727 8744. E-mail: [email protected]

Received 26 October 2007; accepted 24 January 2008; published online 12 March 2008
Riederer et al: Reprinted by permission from Macmillan Publishers Ltd: Journal of Exposure Science & Environmental Epidemiology, 19(3): 248-259, 2009.
Predictors of chloroform in personal air among US adults

In mammals, inhaled chloroform is metabolized in the liver, kidney, and nasal mucosa to trichloromethanol, which degrades to phosgene (US Environmental Protection Agency (EPA, 2001a)). Phosgene reacts with nucleophilic groups on enzymes and proteins to form cytotoxic adducts (EPA, 2001a). There is no current evidence of long-term bioaccumulation in humans (EPA, 2001a). Although an inhalation reference concentration has not been published, EPA published an oral reference dose of 0.01 milligrams per kilogram body weight per day (mg/kg-d) based on animal evidence of hepatotoxicity (EPA, 2007). EPA classifies chloroform as a probable human carcinogen based on animal studies showing that inhalation or ingestion at cytotoxic doses produces hepatic and renal neoplasia (EPA, 2001a). EPA has published an inhalation unit risk of 2.3 × 10⁻⁵ per µg/m3 and estimated air concentrations of 4 µg/m3, 4 × 10⁻¹ µg/m3, and 4 × 10⁻² µg/m3 at the 1 in 10⁴, 1 in 10⁵, and 1 in 10⁶ cancer risk levels, respectively (EPA, 2007). The California Environmental Protection Agency (CalEPA) published an inhalation unit risk of 5.3 × 10⁻⁶ per µg/m3 (CalEPA, 2002). There is limited evidence for mutagenicity or reproductive effects at doses below those causing systemic toxicity (EPA, 2001a).

Chlorinated water is thought to be the primary source of non-occupational chloroform exposure among US adults (Nieuwenhuijsen et al., 2000; Wallace, 2001). Chloroform is formed in treated water by the reaction of chlorine with humic acids and other organic material. Concentrations vary by region, day, and time, with reported levels ranging from below detection to maximum values of 100–200 µg/l (Clayton et al., 1999; Backer et al., 2000; Kerger et al., 2000; Lynberg et al., 2001; Gordon et al., 2006). Bench-scale experiments have shown that heating tap water gradually (as in a hot water heater) or boiling it can affect point-of-use levels. Weisel and Chen (1994) recorded up to twofold increases in tap water chloroform after heating from 25–65°C for 30 min, presumably from increased formation reactions among free chlorine and dissolved organic constituents. Krasner and Wright (2005), on the other hand, hypothesized that simultaneous formation and volatilization were responsible for the 34% decrease they observed in tap water chloroform after boiling for one minute.

Chloroform’s volatility and ubiquitous presence in tap water may help explain why it is frequently detected in personal air (i.e., air sampled from the breathing zone of subjects) and indoor air at concentrations 10–100-fold higher than outdoor levels. The EPA TEAM (Total Exposure Assessment Methodology) studies showed consistently higher levels in personal than outdoor air in 24-h samples collected from over 1,500 subjects in four states (Wallace, 1987). Recently, Weisel et al. (2005) measured higher concentrations in 48-h samples of personal air (adult median 1.04 µg/m3) and indoor air (median 0.92 µg/m3) than in colocated outdoor samples (median 0.17 µg/m3) from 300 homes in Los Angeles, Elizabeth, and Houston. Other US researchers have found similar ratios of personal to indoor and outdoor levels (Clayton et al., 1999; Payne-Sturges et al., 2004; Sexton et al., 2004a).
While these studies illustrate the greater exposure potential of personal and indoor air versus outdoor air, less is known about which activities and microenvironments contribute the largest fraction of daily inhalation intake. US adult volunteer studies point to showering and/or bathing with chlorinated tap water as a major contributor to daily inhalation exposures. Gordon et al. (2006) found a 440-fold increase in bathroom air chloroform after subjects took hot showers in their study of household water use activities by seven volunteers. Kerger et al. (2000) found that bathroom air chloroform increased 3 and 1 µg/m3 during showering and bathing, respectively, for each µg/l chloroform in water. Using a mass balance approach, water use data, and exposure factor assumptions including 1 mg/l tap water chloroform, McKone (1987) estimated that showering contributes up to 50% of lifetime chloroform inhalation exposures for the average US adult versus spending time in the bathroom or remainder of the house. Additionally, recent biomarker studies of US adults show that showering/bathing is significantly associated with increases in breath and/or blood chloroform, while other household water use activities such as washing dishes or clothes are not (Weisel et al., 1999; Backer et al., 2000; Lynberg et al., 2001; Nuckols et al., 2005; Xu and Weisel, 2005). Swimming in chlorinated pools is also associated with elevated biomarker concentrations, though most studies have been conducted outside the United States (Lindstrom et al., 1997; Lévesque et al., 2000; Erdinger et al., 2004; Caro and Gallego, 2007).

We used multiple linear regression to investigate the major predictors of chloroform in personal air in the NHANES 1999–2000 VOC (volatile organic compound) Subsample (US Centers for Disease Control and Prevention (CDC, 2007a)). The NHANES data, which include chloroform concentrations in personal air and household tap water in addition to socioeconomic data and information on activity patterns, provide a unique opportunity to evaluate predictors of inhalation exposures in a nationally representative sample. We hypothesized that personal exposure microevents such as showering/bathing and/or spending time at a pool would be significantly associated with chloroform concentrations in personal air while associations with other exposure factors would not. We also compared personal air levels to EPA’s inhalation unit risk values to evaluate the distribution of cancer risk at the national level and among key subgroups.
Methods
NHANES Data Collection
Detailed methods are available at the NHANES website (CDC, 2001). Briefly, a random subsample of subjects aged 20–59 was recruited to participate in the VOC study during the NHANES medical examination. Consenting subjects wore passive VOC exposure badges (3M™ Organic Vapor Monitor 3520, 3M Corporation, St Paul, MN, USA) continuously for 46–76 h after the examination. Subjects were instructed to wear the badge on the upper chest, leave it on a bedside table or clipped to a nearby lampshade while sleeping, and leave it in an adjacent room while showering since humidity affects readings. Subjects were also asked to record hours spent indoors at home, indoors at work/school, and outdoors using an activity log, and instructed to collect a tap water sample from a bathtub or an outside faucet in an NHANES-provided container (CDC, 2001). When subjects returned their samples, an NHANES interviewer administered a brief questionnaire to collect information on VOC exposure-related activities (CDC, 2001). Home examiners interviewed and collected samples from subjects who could not return to the trailer within 46–76 h; samples collected outside this window were considered invalid. Samples were analyzed at CDC or contract laboratories. Badge measurements below the analytical detection limit were replaced with the detection limit, adjusted for badge wearing minutes, divided by √2 (CDC, 2005a). Water measurements below detection were replaced with the detection limit divided by √2. Although badge field
duplicates, field blanks, and positive controls were collected, results for these quality control samples were not available in the NHANES public release data. In addition to tap water chloroform, we considered 30 NHANES variables potential predictors of chloroform inhalation exposure. Of these, 17 were from the VOC Questionnaire (CDC, 2001), eight from the Demographic Questionnaire (CDC, 2005b), and three from the Housing Characteristics Questionnaire (CDC, 2005c). Another, body mass index, was recorded during the NHANES examination (CDC, 2005d). We considered the variable indicating whether subjects participated in morning, afternoon or evening examination sessions (CDC, 2005d) a proxy for the time of day subjects began wearing the badge. We downloaded the relevant data sets from the NHANES website (CDC, 2005a, d–f, 2007a, b) and used the NHANES VOC Subsample weights (WTSVOC2Y) as well as the stratum (SDMVSTRA) and cluster (SDMVPSU) variables available in the NHANES 1999–2000 demographic data for weighted statistical analyses. Certain NHANES data were updated after their initial public release; all used in the present study were updated as of June 2007.
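The below-detection substitution rule described in the data collection methods (detection limit divided by √2) can be sketched as follows. This is a minimal illustration, not the NHANES processing code: the function name and sample values are hypothetical, the 0.2 µg/l water detection limit is taken from the "not detected" threshold reported in Table 3, and the adjustment of badge detection limits for wearing minutes is not reproduced because the source does not give its form.

```python
import math

def fill_below_detection(value, detection_limit):
    """Return the measured value, or detection_limit / sqrt(2) if it is below detection.

    `value` may be None to represent a result reported only as "below detection".
    """
    if value is None or value < detection_limit:
        return detection_limit / math.sqrt(2)
    return value

# Hypothetical tap water results in ug/l; 0.2 ug/l is assumed as the detection limit.
WATER_DL = 0.2
water_results = [13.7, None, 0.1, 74.7]
filled = [fill_below_detection(v, WATER_DL) for v in water_results]
print(filled)
```

Note that the NHANES public release files already contain these fill-in values, so an analyst would normally only need such a function to check or reproduce them.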
Variable Recodes
We preserved the NHANES categorical variable groupings but recoded them so the group with the highest weighted frequency in the VOC subsample was the reference group. Minor recodes included combining the “something else” and “dorm” responses to the NHANES type of home question into one category and transforming badge wearing minutes to hours. We treated household income (INDHHINC) as a continuous variable using the NHANES numerical categories (1–11) instead of their corresponding income ranges. NHANES included two additional income categories (12, >$20,000 and 13, <$20,000) to minimize refused/don’t know responses. We recoded Category 12 responses as missing; this affected 3.2% of subjects. There were no Category 13 responses.

We developed a new occupation variable to identify subjects with workplace exposure potential. NHANES Question OCD230 asked subjects the industry they worked in while Question OCD240 asked the type of work they performed. We created a variable (“occupation”) with five response categories: 0 = other; 1 = food preparation/store/restaurant; 2 = manufacturing (paper, chemicals, food, electrical/transport equipment); 3 = construction; and 4 = no industry/job recorded. We considered Categories 1–3 to have workplace exposure potential based on information from the 11th Report on Carcinogens (US Department of Health and Human Services (DHHS, 2005)) and industries reporting >10,000 lb annual chloroform releases during 1999–2000 to the EPA Toxic Release Inventory Program (EPA, 2001b, 2002). We considered other industries/jobs (Category 0) to have limited exposure potential.
Category 1 includes subjects who reported “retail-food stores” or “retail-eating/drinking places” in response to Question OCD230, as well as subjects who reported “cooks” or “miscellaneous food preparation/service” to Question OCD240. We assumed these workers would spend part of the day in a kitchen around water use activities. If a subject said s/he worked in food preparation but as a waitress/waiter, we coded her/him as Category 0, assuming s/he spent less time around water than cooks or dishwashers, for example. Category 2 includes workers in food/kindred products, paper products/printing/publishing, chemicals/petroleum/coal products, or transportation equipment industries. Textile/apparel/furnishings machine operators were also included in Category 2 since one author (A. Riederer) observed extensive water use on visits to US textile mills in the 1990s, and since the textile response category to Question OCD230 applied to finished products, which we assumed do not require as much water to manufacture as unfinished cloth. Category 3 includes subjects who reported working in construction (Question OCD230) and/or in construction trades (Question OCD240).
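Two of the recodes described above lend themselves to a short sketch: choosing the reference group as the category with the highest weighted frequency, and treating income categories 1–11 as numeric while setting the coarse fallback categories to missing. The function names and example data are hypothetical, and mapping every code outside 1–11 to missing is an assumption (the text only states that Category 12 was recoded as missing and that Category 13 did not occur).

```python
from collections import defaultdict

def reference_category(categories, weights):
    """Return the category with the largest sum of sample weights (None is skipped)."""
    totals = defaultdict(float)
    for cat, w in zip(categories, weights):
        if cat is not None:
            totals[cat] += w
    return max(totals, key=totals.get)

def recode_income(code):
    """Keep NHANES income categories 1-11 as numeric; treat anything else as missing."""
    if code is not None and 1 <= code <= 11:
        return code
    return None

# Hypothetical home-type responses and sample weights.
home_type = ["detached", "apartment", "detached", "mobile", None]
sample_wt = [250000.0, 180000.0, 210000.0, 90000.0, 50000.0]
print(reference_category(home_type, sample_wt))
print([recode_income(c) for c in (3, 11, 12, None)])
```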
Exploratory Data Analysis
Of the 851 subjects selected, 669 completed the VOC sampling protocol. CDC adjusted the subsample weights for non-response and to match projected Census 2000 counts; the weights sum to 150,249,991 (CDC, 2006). We calculated weighted response frequencies and 95% confidence intervals (95% CIs) using PROC SURVEYFREQ in SAS 9.1 (SAS Institute, Cary, NC, USA). We also conducted exploratory analysis on the weighted and unweighted continuous variables. Distributions of raw and log-transformed data were visually evaluated for normality and outliers. Variables with histograms appearing right-skewed were log-transformed for the regressions. We evaluated collinearity between continuous predictors using simple scatter plots. Last, we calculated weighted cumulative percentiles of personal air chloroform and 95% CIs for the percentile estimates using the DESCRIPT procedure in SUDAAN 9.0.0 (Research Triangle Institute, Research Triangle Park, NC, USA).

Regression Modeling and Diagnostics
We conducted weighted regression modeling in SUDAAN PROC REGRESS, using the NHANES fill-in values for measurements below detection. Model building was conducted by first performing univariate regressions of log-transformed chloroform badge concentrations on each of the 31 initial predictors. We also included a quadratic term for badge wearing time to account for potential non-linearity in response. Predictors with P-values of 0.2 or less were retained for the multivariable analysis. These were assigned a random number and added one-by-one in ascending order to a multivariable model fitted using PROC REGRESS. Predictors with P ≤ 0.2 were retained in each subsequent step.
We fit the final model and manually removed predictors with P > 0.05 until all remaining predictors had P ≤ 0.05, our criterion for statistical significance. We evaluated model assumptions of normality and homoscedasticity by examining plots of predicted values versus residuals as well as histograms and normal probability plots of residuals. Model fit was evaluated using the R² statistic. Following Korn and Graubard (1998), we examined partial regression plots to identify potentially influential observations, then compared parameter estimates in the full model versus a model with each influential observation excluded. Influential observations were excluded one at a time in these analyses.
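The univariate screening step above regresses log badge chloroform on one predictor at a time using the sample weights. A minimal sketch of the weighted point estimates for a single predictor is shown below. It is an illustration only: the function name and data are hypothetical, natural logs are assumed, and the design-based standard errors and P-values that SUDAAN computes from the stratum and cluster variables are not reproduced here.

```python
import math

def weighted_simple_regression(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x with weights w."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope

# Hypothetical data: tap water chloroform (ug/l), badge chloroform (ug/m3), sample weights.
water = [0.14, 5.0, 13.7, 40.0, 74.7]
air = [0.4, 0.9, 1.1, 2.5, 6.0]
weights = [200000.0, 150000.0, 300000.0, 120000.0, 80000.0]

log_air = [math.log(c) for c in air]  # natural log is an assumption
intercept, slope = weighted_simple_regression(water, log_air, weights)
print(intercept, slope)
```

In the procedure described in the text, a predictor would pass the screen only if its design-based P-value were 0.2 or less, which requires survey software or an equivalent variance estimator.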
Cancer Risk Estimates
We estimated lifetime excess cancer risk for individual subjects by multiplying each subject’s badge concentration by EPA’s chloroform inhalation unit risk (EPA, 2005). This method estimates an individual’s upper-bound risk of developing cancer over a lifetime (70 years) of exposure at the measured concentration. We estimated population risk in units of excess cancer cases by multiplying each subject’s individual excess risk by her/his NHANES sample weight, then summing across the total population or subgroup. To evaluate the distribution of risk burden within subgroups, we calculated the weighted percent of each subgroup at the ≥1 in 10⁴, 1 in 10⁶–1 in 10⁴, and ≤1 in 10⁶ individual risk levels. We considered subgroups with higher proportions of people at the ≥1 in 10⁴ risk level to bear a greater cancer burden than subgroups with fewer at that level. For comparison, we repeated these calculations using the CalEPA inhalation unit risk (CalEPA, 2002).

Results
Weighted Detection and Response Frequencies
Chloroform was measured at levels at or above detection limits in 77.2% of badge and 80.1% of water samples. Measurements were below detection in 20.0% of badge and 15.3% of water samples, while 2.8 and 4.6% of badge and water samples, respectively, were missing. One water measurement exceeded the upper bound of the calibrated range of the analytical method, but by <20%, so we included it in our regressions; excluding it did not change statistical outcomes.

Table 1 shows weighted response frequencies and descriptive statistics for the regression predictors. Missing responses ranged from 0 to 4.2% while refusals or “don’t know” (not shown) accounted for <1% of responses. Household income (not shown) was missing for 8.4% of subjects, while the three most commonly reported categories were $25,000–34,999 (11.7%), $55,000–64,999 (10.4%), and ≥$75,000 (22.7%). The largest share of subjects (48.8%) participated in the morning NHANES examination. A majority reported wearing the badge at all times (88.2%) and taking a hot shower for ≥5 min (85.9%). Half (55.4%) reported having windows open at home, and/or breathing fumes from/using air fresheners/room deodorizers (47.4%) and/or disinfectant/degreasing cleaners (39.5%). Less than a third responded yes to other chloroform-related items on the VOC Questionnaire. Only 8.8% reported visiting a pool. The median badge wearing hours was 53.6 and no subject wore her/his badge <28 h. Median hours spent indoors at home, indoors at work/school, and outdoors were 29.9, 7.8 and 5.7, respectively. Median chloroform in water was 13.7 ng/ml (7.0–19.3 ng/ml, 95% CI) while the 95th percentile was 74.7 ng/ml (50.6–112.9 ng/ml, 95% CI).

Distribution of Personal Air Chloroform
Figure 1 shows the weighted cumulative distribution of personal air chloroform in NHANES 1999–2000. Median and 95th percentile levels were 1.13 µg/m3 (0.93–1.39 µg/m3, 95% CI) and 12.05 µg/m3 (8.12–13.54 µg/m3, 95% CI), respectively. The maximum concentration was 53.9 µg/m3. Figure 1 also shows the detection limits and the EPA-estimated air concentrations at the 1 in 10⁵ and 1 in 10⁴ cancer risk levels. Detection limits varied for each badge depending on wearing duration, with those worn longer having lower limits than those worn for shorter periods. All measurements at or below the 1 in 10⁵ risk level (0.4 µg/m3), corresponding to 13% of US adults, were below detection. All measurements at or above 0.55 µg/m3 were in the detectable range. Approximately 59% (6% of US adults) of values in the 0.41–0.55 µg/m3 range were below detection while 41% (4% of US adults) of values in this range were detectable. The majority (62%) of US adults had measurements at the 1 in 10⁵–1 in 10⁴ risk level while 19% had values exceeding the 1 in 10⁴ risk level.
Significant Predictors of Personal Air Chloroform
Predictors eliminated by the univariate screen included: wore badge at all times, education, body mass index, new carpets, hours indoors at work/school, hours outdoors, took hot shower for ≥5 min, in dry cleaning shop/drycleaned clothes, near wood-burning, breathed fumes from/used dry cleaning fluid/spot remover, and breathed fumes from/used glues/adhesives, hobbies/crafts. Predictors eliminated during multivariable modeling included: badge wearing hours, examination session, gender, occupation, income, wear respirator at work, wear gloves at work, number of rooms in the home, hours indoors at home, use home water treatment devices, store paints/fuels inside home, and breathe fumes from/use paint, disinfectant/degreasing cleaners, and air fresheners/room deodorizers. Diagnostic plots suggested that model assumptions of normality and homoscedasticity were valid. A maximum of 13 parameters were estimable in the final fitted model. Table 2 summarizes the regression coefficients (βs) for predictors
Table 1. Weighted response frequencies and descriptive statistics of chloroform inhalation exposure predictors in the NHANES 1999–2000 VOC Subsample.

Categorical predictors
Predictor (NHANES ID) / NHANES code               %

Exam session (PHDSESN)
    missing/don't know                            -
    0 = morning                                   48.8
    1 = afternoon                                 29.6
    2 = evening                                   21.7
Demographic
Age (RIDAGEYR)
    missing/don't know                            -
    number (20–59)                                100
Gender (RIAGENDR)
    missing/don't know                            -
    1 = male                                      48.6
    2 = female                                    51.4
Race/ethnicity (RIDRETH1)
    missing/don't know                            -
    1 = Mexican American                          7.3
    2 = Other Hispanic                            7.8
    3 = Non-Hispanic White                        68.8
    4 = Non-Hispanic Black                        11.7
    5 = Other Race                                4.4
Highest level education (DMDEDUC)
    missing/don't know                            -
    1 = < high school                             20.0
    2 = high school diploma                       26.4
    3 = > high school                             53.6
Body mass index (BMXBMI)
    missing                                       0.2
    number                                        99.8
Housing
Type of home (HOD010)
    missing                                       0.6
    1 = mobile home/trailer                       6.9
    2 = 1 family, detached                        61.5
    3 = 1 family, attached                        6.1
    4 = apartment                                 23.2
    5 = something else                            1.0
    6 = dorm                                      0.6
Number of rooms in home (HOD050)
    missing                                       0.6
    number (1–12)                                 98.7
    13 = 13 or more                               0.5
Source of tap water in home (HOQ070)
    missing                                       0.6
    1 = private/public water                      83.9
    2 = private/public well                       15.3
    3 = something else                            0.0
Occupation
Type of industry/job (recoded from OCD230 and OCD240)
    0 = other                                     62.0
    1 = food prep/store/rest                      5.1
    2 = manuf-paper/chem/food/elec/transp equip   3.9
    3 = construction                              6.4
    4 = none recorded                             22.5

Continuous predictors
Predictor (NHANES ID)                 % Missing  Median  95th percentile  Range
Time-activity patterns
Badge wearing hours (LBAVOCSD-h)      0.9        53.6    72.3             34–190
Hours indoors at home (VTQ090)        3.1        29.9    49.5             2–70
Hours indoors work/school (VTQ110)    3.1        7.8     24.6             0–45
Hours outdoors (VTQ120)               3.1        5.7     23.4             0–40

Yes/no predictors
Predictor (NHANES ID)                                       % Missing  % Yes  % No
Personal exposure microevents
Wear badge at all times (VTQ015)                            4.2        88.2   7.6
Any windows open at home (VTQ100)                           3.1        55.4   41.2
Any time at swimming pool (VTQ140)                          3.4        8.8    87.8
In drycleaning shop, drycleaned clothes (VTQ150)            3.4        14.2   82.1
Near wood-burning fire 10 min or longer (VTQ160)            3.4        8.9    87.4
Hot shower for 5 min or longer (VTQ180)                     3.4        85.9   10.5
Breathe fumes from/use:
  Paint (VTQ200A)                                           3.7        10.0   86.2
  Disinfectant/degreasing cleaners (VTQ200C)                3.8        39.5   56.7
  Air fresheners/room deodorizers (VTQ200J)                 3.8        47.7   48.2
  Drycleaning fluid/spot remover (VTQ200K)                  3.8        6.3    89.9
  Glues/adhesives, hobbies crafts (VTQ200L)                 3.8        8.8    87.4
New carpets home/work past 6 months (VTQ070)                3.1        18.3   78.6
Housing
Store paints/fuels inside home (VTQ080)                     3.1        37.2   59.7
Use home water treatment devices (HOQ080)                   0.6        25.9   73.1
Occupation
Ever wear respirator at work (OCQ310A recoded) (a)          -          2.2    97.9
Ever wear gloves (ex. for cold) at work (OCQ310C recoded) (a)  -       19.2   80.8

(a) “No” category includes subjects answering no to NHANES Question OCQ300, “Ever wear protective equipment at work,” and missing responses to NHANES Question OCD230, “What kind of business or industry is this?”.
[Figure 1 plot: cumulative percentile (y-axis, 0–100) versus personal air chloroform in µg/m3 (x-axis, ticks at 0.1, 0.4, 1, 4, and 10), with the 1 in 10⁵ and 1 in 10⁴ risk levels marked. Plot annotations: all values <0.4 µg/m3 below analytical detection limit; 59% of values 0.41–0.55 µg/m3 below detection limit.]
Figure 1. Weighted cumulative percentiles of personal air chloroform among US adults (age 20–59) in NHANES 1999–2000 (dotted lines denote weighted lower and upper 95% confidence intervals for percentile estimates; dark shading indicates values above EPA 1 in 10⁴ cancer risk level, light shading indicates values between 1 in 10⁵ and 1 in 10⁴ risk levels).
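The weighted cumulative percentiles plotted in Figure 1 can be approximated with a simple cumulative-weight rule, sketched below. This is an illustration under stated assumptions: the function name and data are hypothetical, the percentile is taken as the smallest value whose cumulative weight reaches the target share, and the interpolation rules and design-based confidence intervals produced by SUDAAN's DESCRIPT procedure are not reproduced.

```python
def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative sample weight reaches pct percent of the total."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    target = pct / 100.0 * total
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return value
    return pairs[-1][0]

# Hypothetical badge concentrations (ug/m3) and sample weights.
conc = [0.3, 0.8, 1.1, 2.4, 12.0]
wts = [100.0, 250.0, 300.0, 250.0, 100.0]
print(weighted_percentile(conc, wts, 50))
print(weighted_percentile(conc, wts, 95))
```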
significant at the α = 0.05 level in the final model (multiple R² = 0.34). Chloroform in home tap water was a significant predictor of log personal air chloroform with a coefficient of 0.016 (P < 0.0001). Having no windows open at home (β = 0.413, P = 0.0007) and visiting a swimming pool (β = 0.523, P = 0.0102) were also associated with elevated levels. Certain home types were associated with elevated levels relative to the reference group (single family, detached): mobile home/trailer (β = 0.684, P = 0.0204), apartment (β = 0.507, P = 0.0045), and dormitory/something else (β = 0.580, P = 0.0118). Removal of one influential observation reduced the dormitory/something else coefficient by 28% and increased the P-value to 0.0762. Certain race/ethnicity groups were also associated with elevated levels compared to the reference group (Non-Hispanic White): Other Hispanic (β = 0.535, P = 0.0460), and Non-Hispanic Black (β = 0.260, P = 0.0437). Removal of three influential observations changed the P-values in the Non-Hispanic Black category to 0.0528, 0.0588, and 0.0609, respectively, but did not change the coefficients by >10%. Removal of two others changed the P-values for the Other Hispanic category to 0.0645 and 0.0651, respectively, but did not change the coefficient by >7%. Removal of another lowered the Other Hispanic category coefficient by 12% and increased the P-value for the Non-Hispanic Black category to 0.0708. Last, increasing age was a significant predictor of decreasing log badge chloroform (β = −0.008, P = 0.0283).

Table 2. Multivariable weighted regression (multiple R² = 0.34) of log personal air chloroform on demographic and exposure microevent predictors among US adults in NHANES 1999–2000.

Predictor                                   β        SE      Lower 95% limit  Upper 95% limit  P-value (β = 0)
Home tap water chloroform (µg/l)            0.016    0.003   0.010            0.021            <0.0001
No windows open at home (vs. any)           0.413    0.095   0.209            0.617            0.0007
Any time at swimming pool (vs. none)        0.523    0.176   0.145            0.902            0.0102
Home type (vs. single family detached)
  Mobile home or trailer                    0.684    0.262   0.123            1.245            0.0204
  Apartment                                 0.507    0.150   0.186            0.829            0.0045
  Dormitory or something else               0.580    0.201   0.150            1.010            0.0118
Race/ethnicity (vs. Non-Hispanic white)
  Other Hispanic                            0.535    0.244   0.011            1.059            0.0460
  Non-Hispanic black                        0.260    0.117   0.009            0.511            0.0437
Age                                         −0.008   0.003   −0.016           −0.001           0.0282
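Because the outcome is log personal air chloroform, each coefficient in Table 2 can be read as a multiplicative effect on the concentration scale. The sketch below converts a few of the reported coefficients, assuming the model used natural logarithms; the source does not state the log base, so the resulting ratios are illustrative rather than the authors' own figures, and the helper names are hypothetical.

```python
import math

def concentration_ratio(beta, delta=1.0):
    """Multiplicative change in concentration for a `delta` change in the predictor.

    Assumes the regression outcome is the natural log of concentration.
    """
    return math.exp(beta * delta)

def percent_change(beta, delta=1.0):
    """Same effect expressed as a percent change in concentration."""
    return (concentration_ratio(beta, delta) - 1.0) * 100.0

# Coefficients as reported in Table 2.
coefficients = {
    "No windows open at home (vs. any)": 0.413,
    "Any time at swimming pool (vs. none)": 0.523,
    "Apartment (vs. single family detached)": 0.507,
    "Age (per year)": -0.008,
}
for name, beta in coefficients.items():
    print(f"{name}: ratio {concentration_ratio(beta):.2f}")

# Tap water chloroform: effect of a 10 ug/l increase at 0.016 per ug/l.
print(f"Tap water, +10 ug/l: {percent_change(0.016, delta=10):.0f}% change")
```

Under the natural-log assumption, keeping windows closed would correspond to roughly 1.5 times the personal air concentration, holding the other predictors fixed.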
Distribution of Excess Cancer Risk
The weighted median individual excess cancer risk calculated using the EPA inhalation unit risk value was 2.6 × 10⁻⁵ (2.1 × 10⁻⁵–3.2 × 10⁻⁵, 95% CI), whereas the 90th percentile was 1.3 × 10⁻⁴ (1.0 × 10⁻⁴–1.9 × 10⁻⁴, 95% CI). Using the CalEPA value, the median was 6.0 × 10⁻⁶ (5.0 × 10⁻⁶–7.4 × 10⁻⁶, 95% CI) and the 90th percentile was 3.0 × 10⁻⁵ (2.3 × 10⁻⁵–4.4 × 10⁻⁵, 95% CI).

Table 3 summarizes the excess cancer risk across key subgroups. On a total population level, chloroform inhalation was estimated to account for 9,197 cancer cases (or 2,119 cases using the CalEPA value). US adults with detectable tap water chloroform accounted for 8,453 excess cases versus 477 cases among those without detectable tap water levels. Having windows closed at home during sampling was associated with an additional 1,235 cases versus having windows open. Although visiting a pool was a significant predictor in the regressions, non-swimmers accounted for the larger burden of cancer risk, with over 7,700 estimated cases. People living in single family, detached homes and apartments accounted for the majority of excess cases by home type (3,947 and 3,539 cases, respectively) while Non-Hispanic Whites accounted for the greatest number of cases (>5,000) among race/ethnic groups.

Table 3 also shows the weighted relative fraction of each subgroup falling in risk ranges of ≥1 in 10⁴ and 1 in 10⁶–1 in 10⁴. All individual risk estimates exceeded 1 in 10⁶ regardless of whether EPA or CalEPA inhalation unit risks were used. Using the EPA value, 15.5% of the US adult population exceeded the 1 in 10⁴ risk level while 81.6% fell in the 10⁻⁶–10⁻⁴ range. Using the CalEPA value, the fractions were shifted to 1.3% and 95.8%, respectively.
The subgroup with detectable home tap water chloroform had a higher fraction of people at the ≥1 in 10⁴ risk level than the subgroup without detectable levels: 18.4% versus 3.4%, respectively, using the EPA inhalation unit risk, and 1.7% versus 0% using the CalEPA inhalation unit risk. Other subgroups with high (i.e., >20%) relative fractions at the ≥1 in 10⁴ risk level included the closed windows subgroup (22.6%), swimmers (20.6%), apartment dwellers (29.7%), dorm/something else dwellers (23.1%), Other Hispanics (32.6%), and Non-Hispanic Blacks (24.5%). With the CalEPA value, these estimates dropped to 3.0% (closed windows), 7.5% (swimmers), 2.9% (apartment dwellers), 0% (dorm/something else dwellers), 4.1% (Other Hispanics), and 3.5% (Non-Hispanic Blacks).
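The cancer risk arithmetic used in these results (individual risk as concentration times inhalation unit risk, population cases as the weight-summed individual risks) can be sketched as follows. The function names and the two-subject example are hypothetical; the unit risk values, the 1.13 µg/m3 median, and the 9,197 case total are taken from the text, and the checks only confirm that the reported figures are mutually consistent.

```python
EPA_IUR = 2.3e-5      # EPA inhalation unit risk, per ug/m3
CALEPA_IUR = 5.3e-6   # CalEPA inhalation unit risk, per ug/m3

def individual_risk(conc_ug_m3, unit_risk):
    """Upper-bound lifetime excess cancer risk at a measured air concentration."""
    return conc_ug_m3 * unit_risk

def population_cases(concs, weights, unit_risk):
    """Excess cases: each subject's risk times her/his sample weight, summed."""
    return sum(w * individual_risk(c, unit_risk) for c, w in zip(concs, weights))

def concentration_at_risk(risk_level, unit_risk):
    """Air concentration corresponding to a given individual risk level."""
    return risk_level / unit_risk

# Risk at the reported median badge concentration (1.13 ug/m3).
print(individual_risk(1.13, EPA_IUR))      # about 2.6e-5
print(individual_risk(1.13, CALEPA_IUR))   # about 6.0e-6

# Concentration at the 1 in 10^4 level, which EPA reports rounded to 4 ug/m3.
print(concentration_at_risk(1e-4, EPA_IUR))

# Because risk is linear in the unit risk, the CalEPA case count is the EPA
# count scaled by the ratio of the two unit risks.
print(9197 * CALEPA_IUR / EPA_IUR)         # about 2,119

# Hypothetical two-subject example of the population calculation.
print(population_cases([1.0, 2.0], [1000.0, 2000.0], EPA_IUR))
```

The linear scaling explains why every CalEPA case count in Table 3 is about 23% of the corresponding EPA count, while the percentage columns shift because fewer subjects cross the fixed 1 in 10⁴ threshold.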
Discussion
The NHANES 1999–2000 data show that eight of 10 US adults are exposed to chloroform in personal air at levels detectable using current analytical methods. Levels were of similar magnitude to those reported in other studies of US adults. Using passive sampling badges (Pellizzari et al., 2001) similar to those used in NHANES, the Clayton et al. (1999) EPA Region 5 study detected chloroform in 68.3% of 6-day personal air samples, with median and 90th percentile levels of 1.96 and 4.54 µg/m3, respectively. The higher detection limit in this study versus NHANES may help explain the lower detection frequency. Other recent studies have detection limits and sampling durations similar to those of NHANES. Weisel et al. (2005) detected chloroform in 84.6, 75.6, and 93.0% of 48-h personal air samples from Los Angeles, Elizabeth, and Houston adults, respectively, using passive sampling badges. The Houston median fell within the 95% CI of the NHANES median, while the Los Angeles and Elizabeth
Table 3. Excess cancer risk from chloroform inhalation among key subgroups of the US adult (age 20–59) population 1999–2000.

Inhalation unit risk: US EPA, 2.3 × 10⁻⁵ (µg/m3)⁻¹; CalEPA, 5.3 × 10⁻⁶ (µg/m3)⁻¹. "Cases" is total excess risk (cases). The columns ≥10⁻⁴, 10⁻⁶–10⁻⁴, and ≤10⁻⁶ give the % of subgroup at excess risk of ≥1 in 10⁴, 1 in 10⁶–1 in 10⁴, and ≤1 in 10⁶ (a). "Missing" is missing badge data (% of subgroup).

                                  US EPA                               CalEPA
Subgroup                          Cases   ≥10⁻⁴  10⁻⁶–10⁻⁴  ≤10⁻⁶     Cases   ≥10⁻⁴  10⁻⁶–10⁻⁴  ≤10⁻⁶     Missing
Total US adult population         9,197   15.5   81.6       0         2,119   1.3    95.8       0         2.8
Home tap water chloroform
  Not detected (<0.2 µg/l)        477     3.4    95.2       0         110     0      98.6       0         1.4
  Detected (0.2–233 µg/l)         8,453   18.4   78.3       0         1,948   1.7    95.1       0         3.3
Windows open at home
  None                            5,216   22.6   76.2       0         1,202   3.0    95.8       0         1.2
  Any                             3,981   10.6   87.0       0         917     0.2    97.5       0         2.3
Time at swimming pool
  None                            7,752   15.2   83.0       0         1,786   0.8    97.5       0         1.8
  Any                             1,445   20.6   76.8       0         333     7.5    89.9       0         2.6
Home type
  Single family, detached         3,947   9.2    87.7       0         910     0.8    96.1       0         3.1
  Single family, attached         584     18.1   78.5       0         135     1.4    95.2       0         3.4
  Mobile home or trailer          876     16.7   82.2       0         202     0.9    97.9       0         1.2
  Apartment                       3,539   29.7   68.1       0         815     2.9    94.9       0         2.2
  Dorm/something else             149     23.1   68.3       0         34      0      91.4       0         8.6
Race/ethnicity
  Mexican American                490     11.8   86.9       0         113     0.8    97.8       0         1.3
  Other Hispanic                  1,222   32.6   63.8       0         282     4.1    92.4       0         3.6
  Non-Hispanic White              5,377   12.6   84.4       0         1,239   0.8    96.2       0         3.0
  Non-Hispanic Black              1,710   24.5   74.1       0         394     3.5    95.1       0         1.4
  Other Race                      397     12.9   82.4       0         92      0      95.2       0         4.8

(a) Weighted percentages; no subjects had excess risk <1 in 10⁶.
medians were slightly lower. Also using passive badges, Sexton et al. (2004a) detected chloroform in 79.2% of 48-h personal air samples from 71 urban Minnesota adults at a median level similar to NHANES, though the 90th percentile was significantly lower than its NHANES counterpart.
Significant Predictors of Personal Air Chloroform
Key demographic (age, race/ethnicity) and housing characteristics (type of home, chloroform concentration in home tap water), and personal exposure microevents (leaving home windows open, visiting a pool) explained 34% of the variance in log personal air chloroform in our model. Other population-based studies report similar associations between water concentrations and personal or indoor air levels (Clayton et al., 1999; Weisel et al., 1999) while volunteer studies report large short-term spikes in personal or indoor air chloroform during showering/bathing and other specific household water use activities (Kerger et al., 2000; Gordon et al., 2006).

Reported showering activity was not significantly associated with personal air chloroform in our model. We believe this may be an artifact of the NHANES shower question, not an indicator of a true lack of association. The question asked, “…did you take a hot shower for five minutes or longer during this time?” (CDC, 2001). Because most (85.9%) subjects answered yes, there may not have been sufficient variance to detect an association in the univariate regression. We did find a significant association between spending time at a swimming pool and personal air chloroform even though a small fraction of NHANES subjects reported visiting a pool. This corroborates recent non-US studies showing elevated indoor air chloroform (range 13–647 µg/m3) at swimming pools (Lévesque et al., 2000; Fantuzzi et al., 2001; Erdinger et al., 2004). Chloroform was measured at 145 µg/m3 in swimming pool air in one US study (Lindstrom et al., 1997) but we could not find more recent estimates in the US literature.

We found a significant association between living in an apartment or mobile home/trailer and elevated personal air chloroform. Keeping home windows closed was also associated with elevated levels, illustrating the influence of air exchange rates.
Although we were not able to test the interaction between home type and window status due to sample size limitations, we believe the home type association may be due to differences in air exchange rates (AER) and home volumes. Sax et al. (2004) found that AER differences explained seasonal variation in the indoor/outdoor air chloroform ratio in homes of New York City (n = 46) and Los Angeles (n = 40) teenagers; New York apartments sampled in winter with low AERs had the highest indoor/outdoor ratio. Weisel et al. (2005) found higher indoor air chloroform in apartments than single family homes in Elizabeth and Los Angeles while in Houston they found higher mean but lower median levels in mobile homes versus single family homes. Mobile homes in Houston and Los Angeles had higher median AERs than other home types, while single-family homes had higher rates in Los Angeles.
We are unable to explain the association between race/ethnicity and personal air chloroform in our analyses, which appear influenced by several observations. The NHANES analytic guidelines caution that the 1999–2000 survey represents Mexican-Americans but not Other Hispanics, thus findings for this group should be interpreted cautiously (CDC, 2002). NHANES 1999–2000 did have a similar proportion of Non-Hispanic Blacks (11.7%) as the 2000 Census ("Black or African American", 11.4%) (Census Bureau, 2000). We found only one other study that tested the race/ethnicity-chloroform association. The EPA Region 5 study reported lower indoor air levels among Non-Hispanic Whites than other race/ethnicities though the difference was not statistically significant (i.e., P > 0.05) (Pellizzari et al., 1999). It is difficult to explain why Non-Hispanic Blacks would have higher chloroform exposures than Non-Hispanic Whites. This may be related to home type though we were not able to test the interaction of home type and race/ethnicity because of small cell sizes. Nonetheless, a crude analysis of the weighted NHANES data showed that 38.6% (22.1–55.1%, 95% CI) of all Non-Hispanic Blacks lived in apartments versus 15.3% (10.5–20.0%, 95% CI) of all Non-Hispanic Whites.
Distribution of Cancer Risk Burden from Chloroform Inhalation
All NHANES subjects had individual excess cancer risks exceeding 1 in 10⁶, a level considered by EPA to trigger an evaluation of whether additional exposure reductions are needed (EPA, 1999). Risk estimates for values below detection were constructed using the CDC fill-in values. Risk estimates for these levels all exceeded 1 in 10⁶. It is important to note that these are upper-bound risk estimates (EPA, 2005) and that lower-bound estimates could include zero if chloroform does not prove to be a human carcinogen. Other US studies show similar population risk estimates. Loh et al. (2007) estimated median and 90th percentile levels similar to NHANES using the CalEPA inhalation unit risk value, personal air data published since 1995, and a simulated population of office workers and non-employed adults aged 18–65 designed to match 2000 Census counts. Using the EPA inhalation unit risk, Sax et al. (2006) estimated a median of 61 excess cancer cases per million using 48-h personal air data from New York City teenagers and 8.2 per million using data from Los Angeles teenagers. The New York estimate was identical to our NHANES-based estimate of excess risk in the US adult population. Payne-Sturges et al. (2004) used the EPA inhalation unit risk and 72-h personal air data to estimate 53.3 per million excess cases for South Baltimore adults. Wallace (1991) used the EPA value, the TEAM outdoor air data, and modeled personal inhalation exposures from showering to estimate an excess risk of 7 × 10⁻⁵ from outdoor air and 5 × 10⁻⁵ from showering, levels similar to the NHANES median.
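The risk figures in this section are quoted in two notations: probabilities such as 7 × 10⁻⁵ and excess cases per million. The short sketch below only converts between the two so the quoted values can be compared on one scale; the function and variable names are illustrative and are not part of the original analysis.

```python
# Illustrative conversion between an individual excess risk (a probability)
# and "excess cases per million". All names here are hypothetical helpers.

EPA_BENCHMARK_RISK = 1e-6  # "1 in 10^6" evaluation level cited in the text


def cases_per_million(risk: float) -> float:
    """Express an individual excess lifetime risk as cases per million people."""
    return risk * 1_000_000


# Values quoted in the text.
quoted = {
    "Wallace (1991), outdoor air": cases_per_million(7e-5),
    "Wallace (1991), showering": cases_per_million(5e-5),
    "NHANES median (this study)": 61.0,
    "Sax et al. (2006), Los Angeles": 8.2,
    "Payne-Sturges et al. (2004)": 53.3,
}

for label, per_million in quoted.items():
    # Ratio to the 1-in-a-million level that triggers further evaluation.
    ratio = per_million / cases_per_million(EPA_BENCHMARK_RISK)
    print(f"{label}: {per_million:.1f} per million ({ratio:.0f}x the benchmark)")
```

On this scale the Wallace (1991) estimates correspond to roughly 70 and 50 cases per million, which is why they are described as similar to the NHANES median of 61 per million.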
Strengths and Limitations
The strength of this study is the large size and nationally representative nature of the NHANES VOC data. We faced sample size limitations nonetheless. To obtain stable P-values in weighted regression, the maximum number of estimable parameters is limited to the denominator degrees of freedom of the variance estimator (i.e., the number of primary sampling clusters minus the number of sampling strata) (Research Triangle Institute, 2001). NHANES 1999–2000 included 27 clusters and 13 strata, thus our regressions were limited to 14 estimable parameters and we were not able to test interactions between main effects. The combined NHANES 1999–2000 and 2001–2002 data has 57 clusters and 28 strata. Including the 2001–2002 VOC data when available would strengthen our analysis by increasing the number of estimable parameters, allowing testing of some interactions. It would also potentially reduce the influence of individual observations in the home type and race/ethnicity subgroups. In addition, the NHANES 1999–2000 investigation chose to use the OVM 3500 series of passive samplers to measure exposure to VOCs in general and chloroform in particular. Passive samplers for VOCs have been used extensively in both occupational and community-based settings for the last two decades. Work by Chung et al. (1999) comparing passive samplers to canister whole-air samplers suggests negative bias for the passive samplers, but further states that differences are likely less than 25%. More recently, Pratt et al. (2005) compared passive samplers similar to those used in NHANES with canister whole-air samplers for VOCs and support the small bias conclusion of Chung et al. (1999). Both of these studies involved laboratory and field components, but not personal monitoring.
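The limit on estimable parameters described above is simple arithmetic on the survey design counts. As a minimal sketch (the function name is a hypothetical helper, not a SUDAAN routine), the denominator degrees of freedom for the two designs mentioned in the text can be computed as follows.

```python
def denominator_degrees_of_freedom(n_clusters: int, n_strata: int) -> int:
    """Clusters minus strata, the rule of thumb quoted in the text for the
    maximum number of estimable parameters in a weighted regression."""
    if n_clusters <= n_strata:
        raise ValueError("a design needs more clusters than strata")
    return n_clusters - n_strata


# NHANES 1999-2000: 27 clusters, 13 strata.
print(denominator_degrees_of_freedom(27, 13))  # 14

# Combined 1999-2000 and 2001-2002 data: 57 clusters, 28 strata.
print(denominator_degrees_of_freedom(57, 28))  # 29
```

Under this rule, pooling the two survey cycles would roughly double the number of parameters that could be estimated, which is the basis for the statement that some interactions could then be tested.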
For personal monitoring, concern has been raised that passive samplers, which rely on passive diffusion and assume an unobstructed pathway to the sample, may be adversely affected. Passive samplers of this type are subject to stagnation effects, resulting from insufficient air movement near the badge and subsequent undersampling of bulk air, as well as other mechanisms such as blockage by clothing, which may result in a low bias to the inferred exposure to VOCs. Further, effects of temperature and relative humidity can also decrease the reliability of such samplers for personal monitoring. However, Sexton et al. (2004a, b) have done extensive monitoring using the OVM 3500 series badge to measure personal exposures and have noted that personal exposure, as measured by a badge worn by the individual, normally exceeds concentrations measured at stationary locations either indoors or outdoors. While none of these studies focused specifically on chloroform, the results are likely applicable to this compound.
Another potential limitation lies in compliance with the badge wearing protocol. A weighted total of 4.2% of subjects did not answer the NHANES question on how long they wore the sampling badge, while 7.6% said they did not wear the badge the whole time. Thus for these subjects the badge measurements may not represent the true exposures across the entire sampling period. However, because the weighted median badge concentrations for these two groups (1.34 and 1.03 µg/m3, respectively) both fall within the 95% confidence limits of the median for subjects who reported wearing the badge the whole time (0.88–1.54 µg/m3), we expect the difference to be small. A third limitation lies in the NHANES sampling protocol. Subjects were instructed not to keep badges in the bathroom while showering/bathing. Explicit instructions were not given for swimmers. Although this was the only practical option using current technology, and other studies have used identical protocols (e.g., Weisel et al., 2005), this likely resulted in measurements that underestimated personal air chloroform for subjects who showered/bathed or swam during the sampling period. Volunteer studies (Kerger et al., 2000; Gordon et al., 2006) and others have shown significant increases in bathroom air chloroform during showering/bathing. In their study of indoor air at five Italian pools, Fantuzzi et al. (2001) found total trihalomethane levels collected poolside that were two times those measured at other pool areas. Erdinger et al. (2004) found higher levels in samples collected 30 versus 150 cm above the water at a German pool. Thus although a badge placed in an adjacent room during a shower/bath, or near a pool during swimming, is likely to capture a fraction of the increased air chloroform resulting from those events, it is not likely to capture the total inhalation exposure of the showerer/bather or swimmer. Simple calculations based on the Gordon et al.
(2006) findings may help illustrate the potential magnitude of the difference. In their small (n = 7) volunteer study of indoor air chloroform during scripted household water-use activities, these researchers found a mean concentration of 2.3 µg/m3 (per µg/l in water) in bathroom air versus a median of 2.0 µg/m3 (per µg/l) in air of an adjacent room during a 10-min hot shower. At the NHANES 1999–2000 tap water chloroform median (13.7 µg/l), this translates to a difference of 0.7 µg/m3-h that might not be captured by the sampling badge if it were kept in an adjacent room instead of the bathroom during the showering event. This 0.7 µg/m3-h is approximately 1% of 61 µg/m3-h, or the median NHANES badge concentration (1.13 µg/m3) times the median badge-wearing hours (53.6). Repeating this calculation using the 95th percentile NHANES tap water concentration (74.7 µg/l) produces an estimate of 3.7 µg/m3-h, or 6% of the NHANES badge median expressed in µg/m3-h. Thus, a rough estimate of the amount of chloroform potentially undersampled in NHANES 1999–2000 might be 1–6% for subjects taking one 10-min shower and 2–12% for subjects taking two 10-min showers, depending on tap water levels and sampling duration. Future research to quantify the difference would help produce more accurate estimates of inhalation exposures to chloroform in the general population.
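The undersampling estimate above can be reproduced step by step. The sketch below uses only the values quoted in the text (the Gordon et al. (2006) air-to-water ratios, the NHANES tap water concentrations, and the median badge concentration and wearing time); the function name and its parameters are illustrative, and the calculation follows the text in comparing a mean bathroom value with a median adjacent-room value.

```python
# Values quoted in the text.
BATHROOM_AIR_PER_WATER = 2.3   # ug/m3 in bathroom air per ug/l in water
ADJACENT_AIR_PER_WATER = 2.0   # ug/m3 in adjacent-room air per ug/l in water
MEDIAN_BADGE_CONC = 1.13       # ug/m3, median NHANES badge concentration
MEDIAN_BADGE_HOURS = 53.6      # h, median badge-wearing time


def shower_undersampling(tap_water_ug_per_l, n_showers=1, shower_minutes=10.0):
    """Return (missed exposure in ug/m3-h, fraction of the median badge total).

    Hypothetical helper: the missed exposure is the bathroom-minus-adjacent-room
    concentration difference, scaled by tap water level and shower duration.
    """
    conc_difference = (BATHROOM_AIR_PER_WATER - ADJACENT_AIR_PER_WATER) * tap_water_ug_per_l
    missed = conc_difference * (shower_minutes / 60.0) * n_showers
    badge_total = MEDIAN_BADGE_CONC * MEDIAN_BADGE_HOURS  # about 61 ug/m3-h
    return missed, missed / badge_total


for label, tap in [("median tap water (13.7 ug/l)", 13.7),
                   ("95th percentile tap water (74.7 ug/l)", 74.7)]:
    for showers in (1, 2):
        missed, fraction = shower_undersampling(tap, n_showers=showers)
        print(f"{label}, {showers} shower(s): "
              f"{missed:.1f} ug/m3-h missed, {fraction:.0%} of badge total")
```

Run as written, the one-shower cases give about 1% and 6% and the two-shower cases about 2% and 12%, matching the ranges stated in the text.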
Acknowledgments
Supported by Grant number 2006-01 of the Mickey Leland National Urban Air Toxics Research Center (NUATRC). Ideas expressed are the authors' and not necessarily those of NUATRC. All authors have disclosed that there exist no potential conflicts of interest regarding this manuscript.
References
Backer L.C., Ashley D., Bonin M.A., Cardinali F.L., Kieszak S.M., and Wooten J.V. Household exposures to drinking water disinfection by-products: whole blood trihalomethane levels. J Expo Anal Environ Epidemiol 2000: 10: 321–326. CalEPA (California Environmental Protection Agency). Air Toxics Hot Spots Program Risk Assessment Guidelines. Part II. California EPA, Sacramento, 2002. Caro J., and Gallego M. Assessment of exposure of workers and swimmers to trihalomethanes in an indoor swimming pool. Environ Sci Technol 2007: 41: 4793–4798. CDC. NHANES 1999–2000 Addendum to the NHANES III Analytic Guidelines. DHHS, CDC, Hyattsville, MD, 2002. CDC. NHANES Data (1999–2000). DHHS, CDC, Hyattsville, MD, 2005a. CDC. NHANES 1999–2000 Family-Demographic Questionnaire. DHHS, CDC, Hyattsville, MD, 2005b. CDC. NHANES 1999–2000 Family-Housing Characteristics Questionnaire. DHHS, CDC, Hyattsville, MD, 2005c. CDC. NHANES Data (1999–2000)-Medical Examination. DHHS, CDC, Hyattsville, MD, 2005d. CDC. NHANES Data (1999–2000)-Demographics. DHHS, CDC, Hyattsville, MD, 2005e. CDC. NHANES Data (1999–2000)-Questionnaire. DHHS, CDC, Hyattsville, MD, 2005f. CDC. Analytical and Reporting Guidelines, NHANES. DHHS, CDC, Hyattsville, MD, 2006. Centers for Disease Control and Prevention (CDC). NHANES Data (1999–2000)-Lab 04. Department of Health and Human Services (DHHS), CDC, Hyattsville, MD, 2007a. CDC. NHANES Data (1999–2000)-Lab 21. DHHS, CDC, Hyattsville, MD, 2007b. CDC. NHANES Laboratory Procedures Manual. DHHS, CDC, Hyattsville, MD, 2001. Census Bureau. Census 2000-Population by Race and Hispanic or Latino Origin for the United States. US Census Bureau, Washington, DC, 2000. Chung C.W., Morandi M.T., Stock T.H., and Afshar M. Evaluation of a passive sampler for volatile organic compounds at ppb concentrations, varying temperatures, and humidities with 24-h exposures 2. Sampler performance. Environ Sci Technol 1999: 33(20): 3666–3671.
Clayton C.A., Pellizzari E., Whitmore R.W., Perritt R.L., and Quackenboss J.J. National human exposure assessment survey (NHEXAS): distributions and associations of lead, arsenic and volatile organic compounds in EPA region 5. J Expo Anal Environ Epidemiol 1999: 9: 381–392. DHHS, Public Health Service. Report on Carcinogens, 11th edn. National Toxicology Program, Washington, DC, 2005. EPA. Residual Risk: Report to Congress. EPA-453/R-99-001. EPA Office of Air Quality Planning and Standards, Research Triangle Park, NC, 1999. EPA. Toxicological Review of Chloroform. EPA/635/R-01/001. EPA, Washington, DC, 2001a. EPA. TRI 1999 Data Release. EPA, Washington, DC, 2001b. EPA. TRI 2000 Data Release. EPA, Washington, DC, 2002.
EPA. Guidelines for Carcinogen Risk Assessment. EPA/630/P-03/001F. EPA, Washington, DC, 2005. EPA (US Environmental Protection Agency). Integrated Risk Information System-Chloroform. EPA, Washington, DC, 2007. Erdinger L., Kühn K.P., Kirsch F., Feldhues R., Fröbel T., and Nohynek B., et al. Pathways of trihalomethane uptake in swimming pools. Int J Hyg Environ Health 2004: 207: 571–575. Fantuzzi G., Righi E., Predieri G., Ceppelli G., Gobba F., and Aggazzotti G. Occupational exposure to trihalomethanes in indoor swimming pools. Sci Tot Environ 2001: 264: 257–265. Gordon S.M., Brinkman M., Ashley D.L., Blount B.C., Lyu C., and Masters J., et al. Changes in breath trihalomethane levels resulting from household water-use activities. Environ Health Perspect 2006: 114: 514–521. Keene W.C., Khalil M.A.K., Erikson III D.J., McCulloch A., Graedel T.E., and Lobert J.M., et al. Composite global emissions of reactive chlorine from anthropogenic and natural sources. J Geophys Res 1999: 104: 8429–8440. Kerger B.D., Schmidt C., and Paustenbach D.J. Assessment of airborne exposure to trihalomethanes from tap water in residential showers and baths. Risk Anal 2000: 20: 637–651. Korn E.L., and Graubard B.I. Scatterplots with survey data. Am Stat 1998: 52: 58–69. Krasner S.W., and Wright J.M. The effect of boiling water on disinfection by-product exposure. Water Res 2005: 39: 855–864. Lévesque B., Ayotte P., Tardif R., Charest-Tardif G., Dewailly E., and Prud’Homme D., et al. Evaluation of the health risk associated with exposure to chloroform in indoor swimming pools. J Toxicol Environ Health A 2000: 61: 225–243. Lide D.R., (ed). Handbook of Chemistry and Physics, 77th edn. CRC Press, Boca Raton, 1996. Lindstrom A.B., Pleil J., and Berkoff D.C. Alveolar breath sampling and analysis to assess trihalomethane exposures during competitive swimming training. Environ Health Perspect 1997: 105: 636–642. Loh M.M., Levy J.I., Spengler J.D., Houseman E.A., and Bennett D.H.
Ranking cancer risks of organic hazardous air pollutants in the United States. Environ Health Perspect 2007: 115: 1160–1168. Lynberg M., Nuckols J., Langlois P., Ashley D., Singer P., and Mendola P., et al. Assessing exposure to disinfection by-products in women of reproductive age living in Corpus Christi, Texas, and Cobb County, Georgia. Environ Health Perspect 2001: 109: 597–604. McCulloch A. Chloroform in the environment: occurrence, sources, sinks and effects. Chemosphere 2003: 50: 1291–1308. McKone T.E. Human exposure to volatile organic compounds in household tap water: the indoor inhalation pathway. Environ Sci Tech 1987: 21: 1194–1201. Nieuwenhuijsen M.J., Toledano M., and Elliott P. Uptake of chlorination disinfection by-products: review. J Expo Anal Environ Epidemiol 2000: 10: 586–599. Nuckols J.R., Ashley D.L., Lyu C., Gordon S.M., Hinckley A.F., and Singer P. Influence of tap water quality and household water use activities on indoor air and internal dose levels of trihalomethanes. Environ Health Perspect 2005: 113: 863–870. Payne-Sturges D.C., Burke T.A., Breysse P., Diener-West M., and Buckley T.J. Personal exposure meets risk assessment: comparison of measured and modeled exposures and risks in an urban community. Environ Health Perspect 2004: 112: 589–598. Pellizzari E.D., Perritt R.L., and Clayton C.A. National human exposure assessment survey (NHEXAS): exploratory survey of exposure among population subgroups in EPA Region V. J Expo Anal Environ Epidemiol 1999: 9: 49–55. Pellizzari E.D., Smith D.J., Clayton C.A., Michael L.C., and Quackenboss J.J. An assessment of the data quality for NHEXAS-Part I: exposure to metals and volatile organic chemicals in Region 5. J Expo Anal Environ Epidemiol 2001: 11: 140–154. Pratt G.C., Bock D., Stock T.H., Morandi M., Adgate J.L., and Ramachandran G., et al. A field comparison of volatile organic compound measurements using passive organic vapor monitors and stainless steel canisters. 
Environ Sci Technol 2005: 39(9): 3261–3268. Research Triangle Institute. SUDAAN User’s Manual, Release 8.0. Research Triangle Institute, Research Triangle Park, NC, 2001. Sax S.N., Bennett D.H., Chillrud S.N., Kinney P.L., and Spengler J.D. Differences in source emission rates of volatile organic compounds in innercity residences of New York City and Los Angeles. J Expo Anal Environ Epidemiol 2004: 14(Suppl 1): S95–109.
Sax S.N., Bennett D.H., Chillrud S.N., Ross J., Kinney P.L., and Spengler J.D. A cancer risk assessment of inner-city teenagers living in New York City and Los Angeles. Environ Health Perspect 2006: 114: 1558–1566. Sexton K., Adgate J.L., Mongin S.J., Pratt G.C., Ramachandran G., Stock T.H., and Morandi M.T. Evaluating differences between measured personal exposures to volatile organic compounds and concentrations in outdoor and indoor air. Environ Sci Technol 2004b: 38(9): 2593–2602. Sexton K., Adgate J.L., Ramachandran G., Pratt G.C., Mongin S.J., Stock T.H., and Morandi M.T. Comparison of personal, indoor, and outdoor exposures to hazardous air pollutants in three urban communities. Environ Sci Technol 2004a: 38: 423–430. Wallace L. Human exposure to volatile organic pollutants: implications for indoor air studies. Annu Rev Energy Environ 2001: 26: 269–301. Wallace L.A. Comparison of risks from outdoor and indoor exposure to toxic chemicals. Environ Health Perspect 1991: 95: 7–13.
Wallace L.A. The Total Exposure Assessment Methodology (TEAM) Study: Summary and Analysis: Volume I. EPA Office of Research and Development, Washington, DC, 1987. Weisel C.P., Kim H., Haltmeier P., and Klotz J.B. Exposure estimates to disinfection by-products of chlorinated drinking water. Environ Health Perspect 1999: 107: 103–110. Weisel C.P., Zhang J., Turpin B.J., Morandi M.T., Colome S., Stock T.H., and Spektor D.M., et al. Relationships of Indoor, Outdoor, and Personal Air (RIOPA): Part I. Collection Methods and Descriptive Analyses. Mickey Leland National Urban Air Toxics Research Center, Houston, TX, 2005. Weisel C.P., and Chen W.J. Exposure to chlorination by-products from hot water uses. Risk Anal 1994: 14: 101–106. Xu X., and Weisel C.P. Human respiratory uptake of chloroform and haloketones during showering. J Expo Anal Environ Epidemiol 2005: 15: 6–16.
Journal of Toxicology and Environmental Health, Part A, 72: 903–912, 2009 Copyright © Taylor & Francis Group, LLC ISSN: 1528-7394 print / 1087-2620 online DOI: 10.1080/15287390902959706
Demographic, Residential, and Behavioral Determinants of Elevated Exposures to Benzene, Toluene, Ethylbenzene, and Xylenes Among the U.S. Population: Results from 1999–2000 NHANES
Elaine Symanski1, Thomas H. Stock2, P. Grace Tee1, and Wenyaw Chan3
1Division of Epidemiology and Disease Control, 2Division of Environmental and Occupational Health Sciences, and 3Division of Biostatistics, University of Texas School of Public Health at Houston, Houston, Texas, USA
Volatile organic compounds (VOC) represent a broad spectrum of compounds and there is growing concern that VOC exposures, in addition to increasing risks for cancer, may be implicated in exacerbating asthma and other adverse respiratory effects. Yet little is known about exposures in the U.S. population beyond the seminal Total Exposure Assessment Methodology (TEAM) studies that were conducted by the U.S. Environmental Protection Agency (U.S. EPA) between 1979 and 1987. This investigation was carried out to evaluate the relationship between personal exposures to benzene, toluene, ethylbenzene, and xylenes (BTEX) and socioeconomic, behavioral, demographic, and residential characteristics using a subsample from the National Health and Nutrition Examination Survey (NHANES) (636 participants who represented an estimated 141,363,503 persons aged 20 to 59 yr in the United States). Personal VOC exposures were evaluated using organic vapor monitors for periods that ranged from 48 to 72 h, and participants were administered a questionnaire regarding personal behaviors and residential characteristics while wearing the monitor. Geometric mean (GM) levels were significantly higher for males for all compounds except toluene. For benzene, GM levels were elevated among smokers and Hispanics. Sociodemographic characteristics could not be evaluated simultaneously in the weighted multiple regression models with the VOC questionnaire data because of issues associated with multicollinearity. Results from the regression analyses suggest that the presence of an attached garage (BTEX), having windows closed in the home during the monitoring period (benzene, toluene), pumping gasoline (toluene, ethylbenzene, and xylenes), or using paint thinner, brush cleaner, or stripper (xylenes) results in higher exposure in the general population and confirm previous findings of studies that were more regional in scope. Once the complete NHANES VOC data are released, additional study is warranted to explore whether risk factors associated with elevated VOC exposures differ in subgroups of U.S. adults, which should inform efforts to develop approaches for minimizing VOC exposures and ameliorating environmental health risks.
Received 7 January 2009; accepted 27 February 2009.
This research was funded by the Mickey Leland National Urban Air Toxics Research Center (NUATRC). The authors appreciate the comments of Dr. Mary Ann Smith, who provided feedback on an earlier version of this article. Address correspondence to Dr. Elaine Symanski, Division of Epidemiology and Disease Control, University of Texas School of Public Health at Houston, 1200 Herman Pressler Dr., RAS 643, Houston, TX 77030, USA. E-mail: [email protected]
Symanski et al: Reprinted by permission of the publisher (Taylor & Francis Group, http://www.informaworld.com).
Included among the hazardous air pollutants (HAP) identified under Title I, Section 112 of the 1990 Clean Air Act Amendments are many volatile organic compounds (VOC) (U.S. EPA, 1990). These compounds are produced from (1) point sources such as refineries and petrochemical plants, (2) mobile sources such as combustion and evaporative emissions from gasoline and diesel vehicles, and (3) area sources such as dry cleaners and gasoline stations that contribute to ambient levels. Variability in personal exposure was an early observation in the U.S. Environmental Protection Agency (EPA)-sponsored TEAM (Total Exposure Assessment Methodology) studies, which were carried out between 1979 and 1987 on representative samples of residents in multiple locations within the United States. In these studies, specific behaviors (e.g., smoking or driving), the use of hot disinfected water and various consumer products, such as paints, adhesives, cleaning agents, deodorizers, and personal care products, and employment in specific occupations were linked to higher VOC exposure levels for particular compounds (Wallace et al., 1985; 1987; 1988; Wallace, 1987). During the decade of the 1990s, several additional population-based studies of VOC personal exposures were conducted in the United States (Clayton et al., 1999; Pellizzari et al., 1999) and Europe (Hoffmann et al., 2000; Edwards et al., 2001). The results of the German Environmental Survey (representative sample of Western Germany) (Hoffmann et al., 2000) and the EXPOLIS-Helsinki Study (representative sample of
Helsinki, Finland) (Edwards et al., 2001) both confirmed the significant impact of environmental tobacco smoke (ETS) and automotive emissions on personal exposures to BTEX (benzene, toluene, ethylbenzene, xylenes) and related compounds; the German study also found a significant effect from specific workplace exposures. More recently, there have been a number of additional personal exposure studies utilizing convenience or targeted population samples focusing on one or more cities in the United States (Sexton et al., 2004, 2007; Kinney et al., 2002; Sax et al., 2006; Payne-Sturges et al., 2004; Phillips et al., 2005; Weisel et al., 2005) and elsewhere (Kim et al., 2002; Serrano-Trespalacios et al., 2004; Hinwood et al., 2007). Although these studies were not probability based, they were all conducted much more recently than the TEAM studies, over a similar time period (1999–2001), and should better reflect the impact of current ambient and indoor sources. Although some of these studies examined the influence of specific factors on VOC exposures, they have been limited in scope to particular regions with study designs that were not intended to allow for generalizations to the U.S. population. However, monitoring of airborne VOC personal exposures was conducted as a component of the 1999–2001 National Health and Nutrition Examination Survey (NHANES). Since NHANES is designed to evaluate the health and nutritional status of the civilian non-institutionalized U.S. population, the VOC subsample provides a unique opportunity to evaluate VOC exposures at a national level. Thus, an investigation was undertaken to evaluate the relationship between personal VOC exposures (BTEX) and socioeconomic, behavioral, demographic, and residential characteristics among a representative sample of the U.S. population, aged 20 to 59 yr.
MATERIALS AND METHODS
Study Design
NHANES is a cross-sectional survey using a complex multistage probability sampling design to collect data representative of the U.S. population based on age, gender, race/ethnicity, and income. Sampling and data collection procedures were reported previously and are publicly available at http://www.cdc.gov/nchs/data/nhanes. Approval for the study was granted by the NCHS Research Ethics Review Board and informed consent was obtained from all study participants. As a component of the NHANES 1999–2001 Volatile Organic Compounds Study, measurements of 10 compounds in air, blood, and tap water were performed on samples collected in a representative subsample of the NHANES study population. A questionnaire was used to collect information on personal activities and home conditions potentially related to VOC exposures during the monitoring period. The NCHS limited release of the VOC subsample results to the initial 2-yr period of the project; this subsample represented a random one-fourth or one-third sample of the total study population aged 20–59 yr participating in
NHANES during 1999 and 2000, respectively (CDC, 2005). The NCHS updated the VOC data in October 2006, and it is these data that were used in the current investigation. This study focuses its assessment on BTEX (personal air sampling measurements only).
Sampling and Laboratory Methods and Administration of the VOC Exposure Monitoring Questionnaire
The personal air samplers employed in the NHANES VOC subsample were 3M 3520 organic vapor monitors (OVM), which are diffusive, double-pad, charcoal-based samplers that were successfully evaluated and used in large field studies of community exposures to air toxics (Chung et al., 1999; Adgate et al., 2004; Weisel et al., 2005). During their first visit to the Mobile Examination Center (MEC) for clinical tests, participants were given instructions on handling the OVM, which was attached to their outer garments near their breathing zone. They were instructed to go back to the MEC 48–72 h later, at which time the study participants returned the OVM and were administered a questionnaire regarding personal behaviors and residential characteristics while wearing the monitor (CDC, 2001). In total, 30 questions gathered information on the following items: (1) the duration the badge was worn (n = 2 questions), (2) household (n = 6) or neighborhood (n = 1) characteristics, (3) time spent in different environments (n = 3), (4) specific activities likely to give rise to VOC exposures, such as pumping gasoline or visiting a dry-cleaning shop (n = 6), and (5) breathing fumes from different products such as paints, furniture polish, and glues or adhesives for hobbies or crafts (n = 12). All personal samples and field blanks were shipped via overnight delivery in hard-plastic coolers with freezer bags to a designated lab for analysis by gas chromatography/mass spectrometry (GC/MS). Two different labs performed these analyses based on the work of Chung et al. (1999).
Over the course of the sampling, an expert review panel assessed the comparability of results between laboratories and determined the validity of the VOC measurements released in the dataset (CDC, 2005). Method detection limits (MDL) were initially calculated as 48-h concentration values and then adjusted on a sample-specific basis for the actual sample duration (CDC, 2005). According to standard practice (Hornung & Reed, 1990), the NCHS substituted sample concentrations below the corresponding MDL with a value equal to MDL/√2; see Lin et al. (2008) for a range of estimated MDL values for each VOC that was measured in the NHANES subsample.
Study Population
Among the study participants in NHANES 1999–2000 who were invited to participate in the VOC subsample (n = 851), 182 persons were excluded because they were missing sample weights due to nonparticipation, 14 were excluded because they lacked personal exposure measurements for all compounds,
and 11 participants were excluded because of missing responses to all questions on the VOC questionnaire. Because it was of interest to evaluate exposures over the full period in which monitoring occurred, an additional exclusion was applied based upon the percent of time the badge was actually worn. For each of the 644 participants remaining, the number of hours reported not wearing the badge was subtracted from the number of hours between the start and stop times wearing the badge and the percent of time that the badge was worn was calculated. Using a cut point of less than 75%, an additional 8 participants were excluded (these participants wore the badge for 2.3 to 37.6 h, or 4 to 51% of the monitoring period). Following all exclusions, 636 participants were available for analysis in our study, representing 141,363,503 people aged 20 to 59 yr in the United States from 1999–2000. Chi-squared tests were used to evaluate differences in gender, age, race/ethnicity, education, family income, country of birth, and smoking status between VOC subsample participants included in our study population and those excluded from analysis (n = 215). No differences were detected at a significance level of .05.
Statistical Methods
Graphs of VOC measurements versus VOC subsample weights were examined to identify potentially influential observations (National Center for Health Statistics [NCHS], 2002); however, none were detected. Inspection of the frequency distributions of the exposure concentrations for each contaminant suggested they were approximately log-normal. Thus, data were log-transformed in all regression analyses. Initially, the natural logarithm of the VOC exposure measurement was separately regressed in a weighted analysis on each potential independent variable.
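The participant counts and the badge-wearing criterion described under Study Population can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the example hours passed to them are invented inputs rather than NHANES records.

```python
# Counts quoted in the text.
INVITED = 851
EXCLUDED_BEFORE_WEAR_CHECK = {
    "missing sample weights": 182,
    "no personal exposure measurements": 14,
    "no responses to VOC questionnaire": 11,
}
EXCLUDED_LOW_WEAR_TIME = 8
WEAR_CUT_POINT_PERCENT = 75.0


def percent_time_worn(hours_start_to_stop: float, hours_not_worn: float) -> float:
    """Percent of the monitoring period during which the badge was worn."""
    worn = hours_start_to_stop - hours_not_worn
    return 100.0 * worn / hours_start_to_stop


def meets_wear_criterion(hours_start_to_stop: float, hours_not_worn: float) -> bool:
    """Participants below the 75% cut point were excluded."""
    return percent_time_worn(hours_start_to_stop, hours_not_worn) >= WEAR_CUT_POINT_PERCENT


remaining = INVITED - sum(EXCLUDED_BEFORE_WEAR_CHECK.values())
analyzed = remaining - EXCLUDED_LOW_WEAR_TIME
total_excluded = INVITED - analyzed

print(remaining)       # 644
print(analyzed)        # 636
print(total_excluded)  # 215

# Hypothetical participant: 60 h between start and stop, badge off for 20 h.
print(meets_wear_criterion(60.0, 20.0))  # False (about 67% worn)
```

The totals agree with the 644 participants remaining before the wear-time check, the 636 analyzed, and the 215 excluded participants used in the chi-squared comparisons.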
Consistent with the recommendation to select a significance level greater than .05 for identifying candidate variables for multivariable analysis (Hosmer & Lemeshow, 2000; Rothman et al., 2008), all covariates with a p value of less than .2 for the adjusted F-statistic with the Satterthwaite correction for degrees of freedom in the univariate regressions were included in the initial multiple linear regression models (described later). In all statistical analyses the specific sample weights, which account for different probabilities of selection (adjusted for poststratification and nonresponse), were applied to the data collected on the subsample of the NHANES population for whom VOC exposures were measured. Variances were adjusted using the stratum and primary sampling unit (PSU) variables. All statistical analyses were performed using SAS System Software (SAS System Version 9.2, Cary, NC) and SAS-callable SUDAAN System Software (SUDAAN 10.0, RTI International, Research Triangle Park, NC).

Variables that were evaluated include responses to the VOC Exposure Monitoring Questionnaire, as well as responses in the Smoking Section of the Sample Person (SP) Questionnaire, the Housing Characteristics Section of the Family Questionnaire, and the Demographics file. For all questionnaire data, “Refused” or “Don’t know” responses were coded as missing. In NHANES, country of birth was reported as “USA,” “Mexico,” or “Elsewhere.” Highest level of education was ascertained as “Less than High School,” “High School,” or “More than High School.” Race/ethnicity was coded in the current study as “Non-Hispanic Whites,” “Non-Hispanic Blacks,” “Hispanics” (including “Mexican-Americans” and “Other Hispanics”), and “Other races.” Income was collapsed into 5 categories ($0–24,999, $25,000–44,999, $45,000–64,999, $65,000+, and missing) and age into 4 categories (20–29, 30–39, 40–49, and 50–59 yr). Measurements of serum cotinine from the NHANES Laboratory 6 data file were dichotomized to classify participants as smokers (>15 ng/ml) versus nonsmokers (≤15 ng/ml) (NCI, 1999). Study participants with a response “Something else” to “Type of kitchen stove used” were assigned a missing value due to the small number of observations within this category (n = 19). From the Housing Characteristics Section of the Family Questionnaire, one participant with “Other” as a response to “Source of tap water in the home” was assigned a missing value, as were 7 participants with responses “Something else” (n = 5) or “Dormitory” (n = 2) for their type of home and 16 participants with a response “Other arrangement” for home ownership. Description of the street that a participant lived on was re-coded to reflect light (rural or country road, dead end residential street, and through residential street) versus heavy (commercial street and major highway) traffic. The proportion of time spent at home, work, or school or outdoors while wearing the OVM was calculated and categorized into tertiles. Summary statistics for each independent variable were examined to assess the extent of missing data, as well as to evaluate whether there was sufficient sample size in each of the response levels.
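Two of the recoding steps above, the cotinine dichotomy and the tertile categorization, can be illustrated as follows. This is a minimal sketch: the function names are hypothetical, the tertile cut points here are unweighted, and the source does not state whether its tertiles were computed with or without survey weights.

```python
import statistics


def smoking_status(cotinine_ng_ml):
    """Classify by serum cotinine: >15 ng/ml smoker, <=15 ng/ml nonsmoker."""
    if cotinine_ng_ml is None:  # missing laboratory value stays missing
        return None
    return "smoker" if cotinine_ng_ml > 15 else "nonsmoker"


def tertile_labels(values):
    """Label each value 1, 2, or 3 by unweighted tertile (an assumption)."""
    cut1, cut2 = statistics.quantiles(values, n=3)
    return [1 if v <= cut1 else 2 if v <= cut2 else 3 for v in values]


# Hypothetical proportions of monitored time spent at home
proportions = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = tertile_labels(proportions)
```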
Almost all of the potential covariates among the study sample had less than 1% of their values missing with the exception of total family income (8.3%), type of stove in the home (4.6%), and serum cotinine levels (3.4%). One variable (breathing fumes from or using mothballs, moth flakes, or moth crystals) was omitted because of potential problems with sparse cells (less than 4% of study participants indicated a positive response). The dependence among pairs of potential determinants of exposure was evaluated using chi-squared tests at a significance level of .05. Due to significant dependence of variables related to active and passive smoking, serum cotinine levels (dichotomized using a cut point of 15 ng/ml to classify smokers and non-smokers) (NCI, 1999) were used in favor of other variables. Gender, age, highest level of education attained, income level, and race/ethnicity were highly correlated with several variables from the VOC questionnaire and thus could not be evaluated simultaneously in a regression model with the data collected on home characteristics and activities during the monitoring period. For these variables, the geometric means (GM) and 95% confidence intervals (CI) were calculated using
the survey sampling weights and accounting for the clustered sampling design. On a compound-specific basis, all other pairs of collinear variables that were also both significant in the univariate regression analyses were evaluated. For inclusion in the initial multiple regression model, (1) presence of an attached garage was selected rather than type of home (toluene, ethylbenzene, xylenes); (2) breathing fumes from or using paint was selected rather than breathing fumes from or using paint thinner, brush cleaner or furniture stripper (ethylbenzene); (3) pumping gas into a car or motor vehicle was selected rather than breathing fumes from or using gasoline (BTEX); (4) pumping gas into a car or motor vehicle was selected rather than breathing fumes from or using diesel (toluene, ethylbenzene, xylenes); (5) source of tap water was selected rather than use of water treatment devices (benzene); (6) using fingernail polish or polish remover was selected rather than using hairspray (benzene, ethylbenzene, xylenes); (7) using furniture polish was selected rather than using hairspray (benzene, ethylbenzene, xylenes); and (8) using disinfectants and degreasing cleaners was selected rather than using furniture polish (benzene).

SUDAAN does not have the capacity to run model selection algorithms for building multiple regression models. Since it was not possible to manually fit all possible regression models given the rather large sets of determinants identified from the univariate analyses (e.g., for 15 potential determinants, there would be 2^15 = 32,768 possible regression models), an ad hoc all possible regression procedure was developed based upon the following steps: (1) perform an “all possible weighted regression” in SAS; (2) select the top 20 models using adjusted R-squared as the criterion for comparing regression models; (3) rerun the weighted regression models in SUDAAN accounting for the complex survey design and select the model with the highest R-squared value.
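To make the size of the all-possible-regressions search concrete, the sketch below counts and enumerates candidate models and shows the usual adjusted R-squared formula used to rank them. It is an illustration only: the function names are hypothetical, the count includes the intercept-only model, and the actual fitting in the study was done in SAS and SUDAAN, not in this code.

```python
from itertools import combinations


def count_candidate_models(n_determinants):
    """Each determinant is either in or out of a model: 2**k subsets."""
    return 2 ** n_determinants


def enumerate_models(determinants):
    """Yield every subset of determinants, from the empty set to the full set."""
    for size in range(len(determinants) + 1):
        yield from combinations(determinants, size)


def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R-squared: 1 - (1 - R2)(n - 1)/(n - p - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


n_models = count_candidate_models(15)
small_example = list(enumerate_models(["garage", "windows", "cotinine"]))
```

For 15 determinants the count is 32,768, which is why fitting each model by hand in SUDAAN was impractical.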
However, the top 20 models identified in step 2 were nearly identical in terms of variable selection to the full model, with no meaningful difference in the adjusted R-squared values across models. Thus, backward elimination was used instead to develop our final models; this approach was selected because it has been suggested that a backward elimination approach is preferable to a forward approach because it allows for an evaluation of the joint effects of subsets of variables even when a smaller subset fails to be predictive (Mantel, 1970). Here, with the full model in place, the variable with the highest p value was removed “manually” (based on the results of the Satterthwaite adjusted F test), the model was rerun, and the process was repeated until a parsimonious model was obtained that was comprised of determinants with p values less than .05. Finally, to address issues associated with multiple comparisons, all p values of significant determinants in the final model were recalculated using a stepdown Bonferroni method. The determinant with the highest stepdown Bonferroni p value greater than .05 was removed from the model. In the case where more than one covariate had the same stepdown Bonferroni
p value, the one with the highest raw p value was removed and the regression model rerun. This procedure was repeated until all stepdown Bonferroni p values were less than .05.

RESULTS

Table 1 provides the arithmetic means, geometric means, selected quantiles, and their 95% confidence intervals (CI) of the personal VOC exposure concentrations of the participants in our study (n = 636). Measurement data on BTEX were available for nearly all participants. Although less than 10% of the measurements were below the MDL for ethylbenzene, o-xylene, m,p-xylene, and toluene, a considerably higher percent of participants (34%) had measurement data below the MDL for benzene. Based on the 50th percentile levels, personal exposures were lowest for o-xylene (2.37 μg/m3) and highest for toluene (17.23 μg/m3). The interquartile range (IQR) varied from 3.62 μg/m3 for o-xylene to 20.5 μg/m3 for toluene.

Table 2 describes sociodemographic and lifestyle characteristics of the study population. Men and women were nearly equally represented in the subsample. More participants were aged 20 to 39 yr (57%) as compared to those 40 yr of age and older (43%). Non-Hispanic Whites (68%) comprised the largest ethnic/racial category. Of those reporting family income, 44% earned less than $35,000. A majority of individuals received more than a high school education (53%). Sixty-seven percent of participants were classified as nonsmokers based upon serum cotinine levels. In addition, presented in Table 2 are the weighted GM levels and 95% CI by age, gender, race/ethnicity, country of birth, smoking status, and educational and income levels. Significantly higher exposures to benzene, ethylbenzene, m,p-xylene, and o-xylene were detected among males. While Hispanics appeared to experience higher exposures to all compounds, statistically significant differences by race/ethnicity were detected only for benzene.
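The stepdown Bonferroni adjustment used in the last stage of model building (see Statistical Methods) can be sketched as below. This assumes the Holm-type stepdown procedure, which is how "stepdown Bonferroni" is commonly implemented; the source does not spell out its exact routine, and the function names and the four raw p values are hypothetical.

```python
def stepdown_bonferroni(p_values):
    """Holm-type stepdown Bonferroni adjusted p values, in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


def determinant_to_remove(p_values, alpha=0.05):
    """Index to drop next, or None if every adjusted p value is below alpha.

    Ties in the adjusted p value are broken by the highest raw p value,
    as described in the text.
    """
    adjusted = stepdown_bonferroni(p_values)
    worst = max(adjusted)
    if worst < alpha:
        return None
    tied = [i for i, a in enumerate(adjusted) if a == worst]
    return max(tied, key=lambda i: p_values[i])


raw_p = [0.004, 0.030, 0.020, 0.045]  # hypothetical raw p values
adjusted_p = stepdown_bonferroni(raw_p)
drop_index = determinant_to_remove(raw_p)
```

In this made-up example three determinants share the same adjusted p value above .05, so the one with the largest raw p value (the last) would be removed and the model refit.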
Exposures to benzene were higher among smokers as compared to nonsmokers; significant differences between current smokers and nonsmokers for the other compounds were not detected. Based upon the results from our univariate regression analyses, 13 potential predictors for benzene satisfied our criteria for consideration in the multiple regression model, 10 for ethylbenzene, 8 for o-xylene, 7 for m,p-xylene, and 10 for toluene (results not shown). Table 3 reports on the final models from the multiple regression analyses for BTEX. Living in homes with attached garages resulted in higher personal exposures for all compounds. Having any windows open in the home during the monitoring period was associated with decreased exposures to benzene and toluene. Pumping gas into a car or other motor vehicle increased exposures to toluene, ethylbenzene, o-xylene, and m,p-xylene. There was more than a twofold increase in exposure to o-xylene and m,p-xylene for individuals who reported breathing fumes from or using paint thinner, brush
TABLE 1 Weighted Arithmetic Means (AM), Geometric Means (GM), and Selected Quantiles of Personal VOC Exposures (μg/m3), VOC Subsample, NHANES 1999–2000 (n = 636 Adults)

| VOC | AM (95% CI) | GM (95% CI) | Quantile 5 (95% CI) | Quantile 25 (95% CI) | Quantile 50 (95% CI) | Quantile 75 (95% CI) | Quantile 95 (95% CI) |
|---|---|---|---|---|---|---|---|
| Benzene | 5.25 (3.73–6.76) | 3.21 (2.52–3.90) | 0.88 (0.82–1.04) | 1.44 (1.20–1.98) | 2.85 (2.54–3.60) | 5.83 (4.02–8.42) | 18.11 (11.50–28.99) |
| Ethylbenzene | 9.37 (4.37–14.37) | 2.93 (2.11–3.74) | 0.52 (0.35–0.88) | 1.34 (1.10–1.76) | 2.57 (1.89–3.28) | 5.18 (3.89–7.57) | 25.42 (15.08–66.22) |
| Toluene | 39.17 (28.36–49.99) | 17.52 (14.54–20.50) | 2.90 (2.45–5.31) | 9.27 (7.27–11.99) | 17.23 (15.32–19.56) | 29.79 (24.23–34.19) | 96.61 (70.75–562.63) |
| o-Xylene | 7.46 (4.66–10.27) | 2.79 (2.08–3.51) | 0.55 (0.43–0.71) | 1.25 (1.04–1.54) | 2.37 (1.93–3.15) | 4.87 (3.71–7.56) | 26.66 (15.07–76.67) |
| m,p-Xylene | 22.24 (11.52–32.97) | 7.29 (5.12–9.45) | 1.00 (0.66–1.86) | 3.31 (2.51–4.31) | 6.59 (4.79–8.44) | 14.33 (9.75–20.37) | 76.27 (42.66–207.76) |

Columns n and % < MDL, listed in the order extracted from the source (the row assignment is not recoverable from the layout): n = 628, 623, 619, 627, 627; % < MDL = 4.28, 7.45, 6.39, 6.96, 34.23.

Note. CI, confidence interval; MDL, method detection limit.
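The weighted arithmetic and geometric means reported in Table 1 follow the usual definitions: the GM is the exponential of the weighted mean of the log-transformed concentrations. The sketch below shows only the point estimates; the concentrations and weights are hypothetical, and the design-based confidence intervals in the table require the stratum and PSU information handled by SUDAAN, which is not reproduced here.

```python
import math


def weighted_arithmetic_mean(values, weights):
    """Sum of weight * value divided by the sum of weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_geometric_mean(values, weights):
    """Exponentiated weighted mean of the natural logs (values must be > 0)."""
    log_mean = weighted_arithmetic_mean([math.log(v) for v in values], weights)
    return math.exp(log_mean)


# Hypothetical concentrations (ug/m3) and sample weights
concentrations = [1.0, 4.0, 16.0]
sample_weights = [1.0, 2.0, 1.0]
am = weighted_arithmetic_mean(concentrations, sample_weights)
gm = weighted_geometric_mean(concentrations, sample_weights)
```

For right-skewed, approximately log-normal data such as these, the GM falls below the AM, which matches the pattern seen for every compound in Table 1.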
TABLE 2 Weighted Geometric Mean Levels of BTEX for U.S. Adults for Various Demographic Subgroups, NHANES 1999–2000, VOC Subsample (n = 636 Adults)

| Characteristic | n | Weighted % | Benzene | Toluene | Ethylbenzene | m,p-Xylene | o-Xylene |
|---|---|---|---|---|---|---|---|
| Age (yr) | | | | | | | |
| 20–29 | 183 | 27.2 | 3.03 (2.42–3.65) | 17.40 (15.29–19.51) | 2.55 (2.08–3.02) | 6.44 (5.17–7.70) | 2.56 (2.11–3.02) |
| 30–39 | 171 | 30.0 | 3.46 (2.55–4.36) | 16.16 (12.57–19.75) | 3.16 (1.72–4.59) | 7.70 (4.43–10.98) | 2.90 (1.89–3.90) |
| 40–49 | 152 | 24.8 | 2.99 (2.15–3.83) | 17.83 (13.65–22.01) | 3.02 (2.21–3.82) | 7.41 (5.05–9.77) | 2.74 (1.97–3.50) |
| 50–59 | 130 | 18.0 | 3.39 (2.16–4.63) | 19.82 (12.70–26.93) | 3.03 (1.76–4.31) | 7.80 (4.10–11.50) | 3.07 (1.79–4.35) |
| Gender | | | | | | | |
| Male | 285 | 48.5 | 3.64 (2.62–4.66)a | 18.99 (14.79–23.19) | 3.46 (2.65–4.26)a | 8.78 (6.43–11.13)a | 3.36 (2.49–4.24)a |
| Female | 351 | 51.5 | 2.85 (2.33–3.36) | 16.24 (13.57–18.90) | 2.49 (1.68–3.31) | 6.11 (4.04–8.17) | 2.34 (1.70–2.98) |
| Race/ethnicity | | | | | | | |
| Hispanic | 225 | 15.1 | 4.20 (2.86–5.55)a | 18.05 (14.13–21.97) | 3.69 (2.10–5.29) | 10.02 (5.70–14.33) | 3.34 (2.02–4.66) |
| Non-Hispanic White | 260 | 68.3 | 2.99 (2.36–3.61) | 17.61 (13.85–21.38) | 2.91 (1.98–3.83) | 7.05 (4.62–9.47) | 2.82 (1.99–3.65) |
| Non-Hispanic Black | 124 | 12.0 | 3.03 (1.88–4.19) | 16.69 (13.86–19.53) | 2.49 (1.84–3.14) | 6.36 (4.48–8.24) | 2.32 (1.80–2.83) |
| Other | 19 | 4.6 | 4.34 (1.08–7.59) | 16.58 (1.34–31.82) | 2.27 (0.10–4.44) | 6.08 (−0.56–12.72) | 2.15 (−0.15–4.44) |
| Country of birth | | | | | | | |
| United States | 446 | 83.0 | 3.05 (2.47–3.62) | 17.07 (14.10–20.04) | 2.84 (2.03–3.65) | 6.90 (4.84–8.95) | 2.69 (2.00–3.38) |
| Mexico | 118 | 4.6 | 4.77 (3.04–6.49) | 20.95 (12.53–29.37) | 3.15 (2.08–4.21) | 8.50 (5.30–11.69) | 3.10 (2.02–4.18) |
| Other | 72 | 12.4 | 3.88 (1.61–6.15) | 19.47 (9.05–29.88) | 3.49 (1.29–5.68) | 9.95 (2.80–17.09) | 3.44 (1.03–5.85) |
| Education | | | | | | | |
| Less than high school | 197 | 19.6 | 3.89 (2.93–4.86) | 16.08 (11.88–20.28) | 3.34 (1.74–4.95) | 8.28 (4.14–12.42) | 2.97 (1.67–4.26) |
| High school | 160 | 27.0 | 3.15 (2.30–4.00) | 18.01 (14.78–21.25) | 3.10 (2.52–3.68) | 7.37 (5.74–9.01) | 2.96 (2.37–3.54) |
| More than high school | 279 | 53.4 | 3.01 (2.28–3.74) | 17.83 (13.76–21.89) | 2.71 (1.87–3.55) | 6.91 (4.53–9.29) | 2.65 (1.88–3.43) |
| Family income | | | | | | | |
| $0–19,999 | 187 | 23.7 | 3.45 (2.38–4.52) | 15.80 (10.51–21.10) | 2.96 (1.66–4.26) | 7.48 (3.69–11.27) | 2.73 (1.61–3.85) |
| $20,000–34,999 | 130 | 20.7 | 3.28 (2.11–4.45) | 16.94 (11.71–22.17) | 2.49 (1.58–3.41) | 6.00 (3.51–8.49) | 2.53 (1.51–3.54) |
| $35,000–64,999 | 150 | 26.7 | 3.13 (2.30–3.96) | 17.97 (13.99–21.96) | 2.90 (2.24–3.56) | 6.84 (5.61–8.07) | 2.59 (2.13–3.05) |
| $65,000 and over | 108 | 22.3 | 2.71 (2.17–3.24) | 17.43 (12.39–22.47) | 2.88 (1.75–4.01) | 7.56 (4.08–11.04) | 2.94 (1.76–4.11) |
| Missing | 61 | 6.6 | 4.48 (1.49–7.47) | 25.97 (13.33–38.60) | 5.10 (1.63–8.57) | 14.10 (3.05–25.15) | 4.78 (1.49–8.07) |
| Serum cotinine level | | | | | | | |
| ≤15 ng/ml | 447 | 67.3 | 2.94 (2.39–3.48)a | 16.75 (13.78–19.73) | 2.63 (1.93–3.32) | 6.83 (4.95–8.71) | 2.66 (1.99–3.33) |
| >15 ng/ml | 162 | 32.7 | 3.90 (2.61–5.20) | 19.09 (13.64–24.54) | 3.60 (2.00–5.20) | 8.42 (4.24–12.61) | 3.08 (1.86–4.30) |

a Satterthwaite adjusted F statistic from the univariate regression analyses, p < .05.
TABLE 3 Final Regression Models of Key Determinants of Personal Exposure to BTEX

| Determinant (p value a) | Regression coefficient (95% CI) | Multiplicative factor b (95% CI) |
|---|---|---|
| Benzene | | |
| Home with attached garage vs. no attached garage (p = .0067) | 0.329 (0.107, 0.551) | 1.39 (1.11, 1.73) |
| Any windows open in the home vs. no windows open (p = .0140) | −0.420 (−0.741, −0.099) | 0.66 (0.48, 0.91) |
| Cotinine level >15 ng/ml vs. ≤15 ng/ml (p = .0185) | 0.309 (0.060, 0.557) | 1.36 (1.06, 1.74) |
| Toluene | | |
| Home with attached garage vs. no attached garage (p = .0051) | 0.384 (0.136, 0.633) | 1.46 (1.15, 1.88) |
| Gas stove vs. electric stove (p = .0205) | −0.288 (−0.524, −0.052) | 0.75 (0.59, 0.95) |
| Any windows open in the home vs. no windows open (p = .0011) | −0.420 (−0.640, −0.202) | 0.66 (0.53, 0.82) |
| Pumping gas vs. not pumping gas (p = .0013) | 0.226 (0.105, 0.347) | 1.25 (1.11, 1.41) |
| Ethylbenzene | | |
| Home with attached garage vs. no attached garage (p = .0207) | 0.325 (0.058, 0.592) | 1.38 (1.06, 1.80) |
| Pumping gas vs. not pumping gas (p = .0072) | 0.343 (0.109, 0.577) | 1.41 (1.12, 1.78) |
| o-Xylene | | |
| Home with attached garage vs. no attached garage (p = .0063) | 0.378 (0.126, 0.631) | 1.46 (1.13, 1.88) |
| Pumping gas vs. not pumping gas (p = .0194) | 0.323 (0.061, 0.586) | 1.38 (1.06, 1.80) |
| Breathing fumes vs. not breathing fumes from furniture polish (p = .0088) | −0.379 (−0.646, −0.112) | 0.68 (0.52, 0.89) |
| Breathing fumes vs. not breathing fumes from paint thinner, brush cleaner or stripper (p = .0010) | 0.818 (0.397, 1.240) | 2.27 (1.49, 3.46) |
| m,p-Xylene | | |
| Home with attached garage vs. no attached garage (p = .0030) | 0.433 (0.174, 0.692) | 1.54 (1.19, 2.0) |
| Pumping gas vs. not pumping gas into a car or motor vehicle (p = .0393) | 0.318 (0.018, 0.617) | 1.37 (1.02, 1.85) |
| Breathing fumes vs. not breathing fumes from furniture polish (p = .0039) | −0.438 (−0.711, −0.166) | 0.65 (0.49, 0.85) |
| Breathing fumes vs. not breathing fumes from paint thinner, brush cleaner or stripper (p = .0058) | 0.769 (0.262, 1.275) | 2.16 (1.30, 3.58) |

a The p value associated with the Satterthwaite adjusted F-statistic in the final model.
b Fold range change in geometric mean levels (μg/m3).
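Because the regressions were fit to the natural logarithm of exposure, the multiplicative factors in Table 3 are the exponentiated regression coefficients and confidence limits. The short sketch below checks this for two benzene rows; the function name is hypothetical and the rounding to two decimals is an assumption about how the table was prepared.

```python
import math


def multiplicative_factor(coef, lower, upper):
    """Exponentiate a log-scale coefficient and its 95% CI limits."""
    return tuple(round(math.exp(x), 2) for x in (coef, lower, upper))


# Benzene rows from Table 3: attached garage, and any windows open
garage = multiplicative_factor(0.329, 0.107, 0.551)
windows = multiplicative_factor(-0.420, -0.741, -0.099)
```

A factor of 1.39 for an attached garage means geometric mean benzene exposure is about 39% higher than without one, while 0.66 for open windows corresponds to exposure about 34% lower.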
cleaner, or furniture stripper. Individuals who used a gas rather than electric stove during the monitoring period experienced lower toluene exposures. Breathing fumes from or using furniture polish decreased personal exposures to o-xylene and m,p-xylene. While there were slight differences in the magnitude of regression estimates, the same factors were identified as predictors of o-xylene and m,p-xylene personal exposures.

DISCUSSION

This study afforded a unique opportunity to use information collected as part of the National Health and Nutrition Examination Survey to identify influential determinants of exposures to selected VOC in a population-based sample of adults aged 20 to 59 yr in the United States. GM levels of VOC exposures were examined on the basis of sociodemographic characteristics. No clear patterns emerged on the basis of age, education, or income level, although study participants who did not provide
information on family income appeared to have higher BTEX exposures as compared to any other group. Results comparing GM levels suggest that personal VOC exposures were significantly higher among males for all compounds except toluene. The U.S. EPA TEAM studies reported mixed results for benzene and o-xylene exposures between males and females depending upon the region studied in the United States, the year of data collection, and the time of sample collection (i.e., overnight or during the day) (Wallace, 1987). Further, increased exposure to benzene (but not toluene, ethylbenzene, or xylenes) was reported among males in a recent study of nonsmoker volunteers from four Australian cities (Hinwood et al., 2007). In contrast, it is also interesting to note that in a multivariate analysis that controlled for several covariates including air levels and body mass index using the NHANES VOC subsample, Lin et al. (2008) reported higher blood levels of BTEX in women as compared to men, although such differences were not significant.
Benzene exposures were higher among Hispanics compared to other ethnic/racial groups. While Arif and Shah (2007) recently reported differences in VOC exposures by race/ethnicity using a subset of VOC study participants, their geometric mean values were consistently only about one-third as large as those reported in this investigation among racial/ethnic groups or overall (across groups) in another study that also relied on the same database (Jia et al., 2008). While it is expected that disadvantaged populations may be exposed disproportionately to environmental contaminants, relatively little has been reported on differences in VOC exposures across such subpopulations. The earliest evaluation looking at differences in VOC exposures between ethnic/racial groups was made in the California TEAM studies in which equivocal results were reported between Hispanics and non-Hispanics for selected compounds depending on where the participants resided and on whether the samples were collected during the day or night (Wallace et al., 1988). About a decade later the NHEXAS Region 5 Pilot Study reported higher (albeit nonsignificant) benzene exposure levels for Hispanic Whites and other minorities as compared to nonminorities (Pellizzari et al., 1999). Our study confirms previous findings from studies that were more regional in scope or that relied on convenience samples, but allows for broader inferences to the general U.S. adult population regarding effects of specific behaviors and activities that contribute to elevated VOC exposures. For example, individuals living in homes with an attached garage experienced increases in exposures to BTEX, which is consistent with studies that showed that vehicle emissions infiltrate from an attached garage indoors into the home (Thomas et al., 1993; Graham et al., 2004; Batterman et al., 2007). 
Exposures to benzene and toluene, both of which have known indoor VOC sources, were lower for individuals whose homes had windows open during the monitoring period. An analysis of personal exposures to 14 VOC among 70 nonsmoking adults in Minneapolis-St. Paul (Sexton et al., 2007) also suggested that being in or near a garage significantly increased exposure to BTEX, and opening windows at home ≥6 h/d decreased VOC exposure to most compounds. It was also found that pumping gas into a car or motor vehicle increased exposures to toluene, ethylbenzene, and xylene but was not an important determinant of benzene exposure. In the aforementioned study in Minnesota (Sexton et al., 2007), factors related to exposure to gasoline were unrelated to increases in personal BTEX exposure. Among the five VOC exposures examined in this study, geometric mean levels were higher for current smokers as compared to nonsmokers for benzene alone. This is in agreement with the results of the NHANES VOC analysis reported by Lin et al. (2008). Likewise, in the multivariate analysis, current smoking was an important predictor of population-based VOC exposure for benzene only, which had an approximately equal effect on increasing personal exposure as living in a home with an attached garage. While tobacco smoke remains a predominant source of exposure to benzene among smokers
(Edwards et al., 2001), the importance of smoking as a key determinant of population-based exposure to BTEX has likely diminished in the United States as the prevalence of smoking has declined over the past 25 yr (Gregg et al., 2005). Our results also indicated more than a twofold difference in exposures to xylenes due to contact with paint thinners, brush cleaner, or furniture stripper. While this result was expected given the chemical constituents of these products, our regression results also produced findings of lower exposures to xylenes associated with the use of furniture polish, as well as lower exposures to toluene with the use of a gas rather than electric stove. To explore the possibility of influential observations on these unanticipated findings, sensitivity analyses were conducted by calculating the difference between the observed and predicted values from the final models and rerunning the regression analyses excluding the upper and lower tails (i.e., omitting values below the 1st and above the 99th percentiles of the distributions of the residuals). However, the associations between xylenes (both analytes) and furniture polish, as well as between toluene and use of a gas or electric stove, remained significant and of a similar magnitude and direction as compared to results from the original final models. It is likely, therefore, that one or more unmeasured factors are confounding the observed associations that were detected. Despite the obvious advantages of NHANES with regard to national representativeness, a relatively small number of persons participated in the VOC subsample. Thus, the capacity of the data to allow for the simultaneous evaluation of multiple predictors is limited (Riederer et al., 2009) because the number of parameters that can be estimated is restricted by the number of primary sampling units and primary clusters. 
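The residual-trimming sensitivity analysis described above can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, the percentile cut points use the default method of Python's `statistics.quantiles` (percentile definitions differ across software), and refitting the survey-weighted model on the retained records is not shown.

```python
import statistics


def indices_within_residual_tails(observed, predicted, lower_pct=1, upper_pct=99):
    """Return indices whose residual lies between the given percentiles."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    cuts = statistics.quantiles(residuals, n=100)  # 99 percentile cut points
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [i for i, r in enumerate(residuals) if low <= r <= high]


# Hypothetical log-scale observations with predictions of zero,
# so the residuals are simply 0, 1, ..., 199
observed = [float(i) for i in range(200)]
predicted = [0.0] * 200
kept = indices_within_residual_tails(observed, predicted)
```

The model would then be rerun on the retained records only, and the coefficients compared with those from the original final model.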
Analysis of subdomains within any single wave of NHANES and thus with the 1999–2000 NHANES VOC subsample as well is also constrained by sample size (CDC, 2006). As such, our evaluation of differences among demographic subgroups was limited, as was our regression analyses that only examined main effects. Another possible limitation relates to differences among study participants in terms of the duration of the monitoring period as well as the length of time that the monitor was worn, although those effects were minimized in our analyses by excluding data on individuals who wore their badges less than 75% of the time. Moreover, the monitoring period among the study participants who met the criteria for inclusion in our study ranged from 43 to 76 h (92% wore the monitor the entire period) and thus these differences in averaging times were not likely to have adversely affected our results. Findings were also restricted by the nature of the questionnaire data that were collected on factors potentially related to VOC exposures in that the majority of the responses were dichotomous (“Yes” versus “No”), which may have introduced some error (i.e., residual confounding) in the effects of those variables included in the final models. Because of multicollinearity between sociodemographic variables and the VOC questionnaire data, it was not possible to evaluate the effects of demographic factors
and household or individual behaviors simultaneously in our regression models. Finally, building regression models was problematic because existing statistical software that accounts for a complex survey design (like SUDAAN) does not have the capacity to run model selection algorithms. An attempt was made to apply two different strategies for model building to facilitate comparisons in the results that may have been influenced by the selection of the algorithm used to produce the final models. However, the ad hoc all possible regression procedure that was developed did not distinguish among the regression models that were evaluated and therefore such a comparison could not be made. While the backward elimination approach that was applied in the current investigation has been used in recent studies of NHANES data (Calafat et al., 2008a, 2008b), other approaches have been applied as well. For example, in a study to examine predictors of chloroform exposure from the NHANES VOC substudy (Riederer et al., 2009), a forward approach was applied by adding variables one at a time, with the order determined randomly. Since our goal was to identify the most important predictors of VOC exposure among adults in the United States, a change in estimate approach was not applied since focus was not restricted to a single determinant (Greenland, 1989). Statistical techniques were also applied for dealing with potential issues associated with multiple comparisons, which have not been applied in previous regression analyses of NHANES data. In conclusion, this is the first study to examine risk factors for higher BTEX exposures among a nationally representative population-based study of U.S. adults. 
Significant differences in exposures were found across selected demographic groups and on the basis of behavioral and household characteristics, which were generally consistent with findings reported in studies that relied on convenience samples or targeted and regionally focused population samples. These results update the earlier U.S. EPA TEAM studies carried out nearly 30 yr ago and should provide a baseline for examining determinants of VOC exposure in the future. Further studies are needed to verify our findings regarding key determinants that influence population-based VOC exposures in the United States and assess whether such determinants vary by racial/ethnic group, which may be facilitated if and when the complete VOC subsample data are released.

REFERENCES

Adgate, J. L., Church, T. R., Ryan, A. D., Ramachandran, G., Fredrickson, A. L., Stock, T. H., Morandi, M. T., and Sexton, K. 2004. Outdoor, indoor, and personal exposure to VOCs in children. Environ. Health Perspect. 112:1386–1392.
Arif, A. A., and Shah, S. M. 2007. Association between personal exposure to volatile organic compounds and asthma among US adult population. Int. Arch. Occup. Environ. Health 80:711–719.
Batterman, S., Jia, C., and Hatzivasilis, G. 2007. Migration of volatile organic compounds from attached garages to residences: A major exposure source. Environ. Res. 104:224–240.
Calafat, A. M., Ye, X., Wong, L. Y., Reidy, J. A., and Needham, L. L. 2008a. Exposure of the U.S. population to bisphenol A and 4-tertiary-octylphenol: 2003–2004. Environ. Health Perspect. 116:39–44.
Calafat, A. M., Ye, X., Wong, L. Y., Reidy, J. A., and Needham, L. L. 2008b. Urinary concentrations of triclosan in the U.S. population: 2003–2004. Environ. Health Perspect. 116:303–307.
Centers for Disease Control and Prevention. 2001. NHANES laboratory procedures manual, Appendix A, Protocol for Volatile Organic Compounds (VOC) Study. U.S. Department of Health and Human Services, Centers for Disease Control and Prevention [accessed 1/10/2008]. www.cdc.gov/nchs/data/nhanes/LAB-Appendices.pdf.
Centers for Disease Control and Prevention. 2005. NHANES 1999–2000 data documentation, Lab 21 - Volatile organic compounds (VOC). U.S. Department of Health and Human Services, Centers for Disease Control and Prevention [accessed 1/10/2008]. www.cdc.gov/nchs/data/nhanes/frequency/lab21_doc.pdf.
Centers for Disease Control and Prevention. 2006. NHANES analytic and reporting guidelines. U.S. Department of Health and Human Services, Centers for Disease Control and Prevention [accessed 2/10/2009]. http://www.cdc.gov/nchs/data/nhanes/nhanes_03_04/nhanes_analytic_guidelines_dec_2005.pdf.
Chung, C. W., Morandi, M. T., Stock, T. H., and Afshar, M. 1999. Evaluation of a passive sampler for volatile organic compounds at ppb concentrations, varying temperatures and humidities with 24-hour exposures. II. Sampler performance. Environ. Sci. Technol. 33:3666–3671.
Clayton, C. A., Pellizzari, E. D., Whitmore, R. W., Perritt, R. L., and Quackenboss, J. J. 1999. National Human Exposure Assessment Survey (NHEXAS): Distributions and associations of lead, arsenic and volatile organic compounds in EPA region 5. J. Expos. Anal. Environ. Epidemiol. 9:381–392.
Edwards, R. D., Jurvelin, J., Saarela, K., and Jantunen, M. 2001. VOC concentrations measured in personal samples and residential indoor, outdoor and workplace microenvironments in EXPOLIS-Helsinki, Finland. Atmos. Environ. 35:4531–4543.
Graham, L. A., Noseworthy, L., Fugler, D., O’Leary, K., Karman, D., and Grande, C. 2004. Contribution of vehicle emissions from an attached garage to residential indoor air pollution levels. J. Air Waste Manage. Assoc. 54:563–584.
Greenland, S. 1989. Modeling and variable selection in epidemiologic analysis. Am. J. Public Health 79:340–349.
Gregg, E. W., Cheng, Y. J., Cadwell, B. L., Imperatore, G., Williams, D. E., Flegal, K. M., Narayan, K. M., and Williamson, D. F. 2005. Secular trends in cardiovascular disease risk factors according to body mass index in US adults. [erratum appears in J. Am. Med. Assoc. 2005 Jul 13;294(2):182]. J. Am. Med. Assoc. 293:1868–1874.
Hinwood, A. L., Rodriguez, C., Runnion, T., Farrar, D., Murray, F., Horton, A., Glass, D., Sheppeard, V., Edwards, J. W., Denison, L., Whitworth, T., Eiser, C., Bulsara, M., Gillett, R. W., Powell, J., Lawson, S., Weeks, I., and Galbally, I. 2007. Risk factors for increased BTEX exposure in four Australian cities. Chemosphere 66:533–541.
Hoffmann, K., Krause, C., Seifert, B., and Ullrich, D. 2000. The German Environmental Survey 1990/92 (GerES II): Sources of personal exposure to volatile organic compounds. J. Expos. Anal. Environ. Epidemiol. 10:115–125.
Hornung, R. W., and Reed, D. R. 1990. Estimation of average concentration in the presence of nondetectable values. Appl. Occup. Environ. Hyg. 5:46–51.
Hosmer, D. W., and Lemeshow, S. 2000. Applied logistic regression, 2nd ed. New York: John Wiley & Sons.
Jia, C., D’Souza, J., and Batterman, S. 2008. Distributions of personal VOC exposures: A population-based analysis. Environ. Int. 34:922–931.
Kim, Y. M., Harrad, S., and Harrison, R. M. 2002. Levels and sources of personal inhalation exposure to volatile organic compounds. Environ. Sci. Technol. 36:5405–5410.
Kinney, P. L., Chillrud, S. N., Ramstrom, S., Ross, J., and Spengler, J. D. 2002. Exposures to multiple air toxics in New York City. Environ. Health Perspect. 110(Suppl. 4):539–546.
Lin, Y. S., Egeghy, P. P., and Rappaport, S. M. 2008. Relationships between levels of volatile organic compounds in air and blood from the general population. J. Expos. Sci. Environ. Epidemiol. 18:421–429.
Mantel, N. 1970. Why stepdown procedures in variable selection. Technometrics 12:621–625.
National Center for Health Statistics. 2008. NHANES 1999–2000 addendum to the NHANES III analytic guidelines. U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 8/30/02 [accessed 11/01/2008]. http://www.cdc.gov/nchs/data/nhanes/guidelines1.pdf.
NCI. 1999. Monograph 10: Health effects of exposure to environmental tobacco smoke [accessed 5/30/2008]. http://cancercontrol.cancer.gov/tcrb/monographs/10/.
Payne-Sturges, D. C., Burke, T. A., Breysse, P., Diener-West, M., and Buckley, T. J. 2004. Personal exposure meets risk assessment: A comparison of measured and modeled exposures and risks in an urban community. Environ. Health Perspect. 112:589–598.
Pellizzari, E. D., Perritt, R. L., and Clayton, C. A. 1999. National human exposure assessment survey (NHEXAS): Exploratory survey of exposure among population subgroups in EPA Region V. J. Expos. Anal. Environ. Epidemiol. 9:49–55.
Phillips, M. L., Esmen, N. A., Hall, T. A., and Lynch, R. 2005. Determinants of exposure to volatile organic compounds in four Oklahoma cities. J. Expos. Anal. Environ. Epidemiol. 15:35–46.
Riederer, A. M., Bartell, S. M., and Ryan, P. B. 2009. Predictors of personal air concentrations of chloroform among US adults in NHANES 1999–2000. J. Expos. Sci. Environ. Epidemiol. 19:248–259.
Rothman, K. J., Greenland, S., and Lash, T. L. 2008. Modern epidemiology, 3rd ed. Philadelphia: Lippincott Williams & Wilkins.
Sax, S. N., Bennett, D. H., Chillrud, S. N., Ross, J., Kinney, P. L., and Spengler, J. D. 2006. A cancer risk assessment of inner-city teenagers living in New York City and Los Angeles. Environ. Health Perspect. 114:1558–1566.
Serrano-Trespalacios, P. I., Ryan, L., and Spengler, J. D. 2004. Ambient, indoor and personal exposure relationships of volatile organic compounds in Mexico City Metropolitan Area. J. Expos. Anal. Environ. Epidemiol. 14(Suppl. 1):S118–S132.
Sexton, K., Adgate, J. L., Ramachandran, G., Pratt, G. C., Mongin, S. J., Stock, T. H., and Morandi, M. T. 2004. Comparison of personal, indoor, and outdoor exposures to hazardous air pollutants in three urban communities. Environ. Sci. Technol. 38:423–430.
Sexton, K., Mongin, S. J., Adgate, J. L., Pratt, G. C., Ramachandran, G., Stock, T. H., and Morandi, M. T. 2007. Estimating volatile organic compound concentrations in selected microenvironments using time–activity and personal exposure data. J. Toxicol. Environ. Health A 70:465–476.
Thomas, K. W., Pellizzari, E. D., Clayton, C. A., Perritt, R. L., Dietz, R. N., Goodrich, R. W., Nelson, W. C., and Wallace, L. A. 1993. Temporal variability of benzene exposures for residents in several New Jersey homes with attached garages or tobacco smoke. J. Expos. Anal. Environ. Epidemiol. 3:49–73.
U.S. Environmental Protection Agency. 1990. Clean Air Act amendments of 1990, Title 1, Air pollution prevention and control, Part A, Air quality and emissions limitations, Section 112, National emission standards for hazardous air pollutants. U.S. Environmental Protection Agency [accessed August 7, 2008]. http://www.epa.gov/air/caa/caa112.txt.
Wallace, L. A., Pellizzari, E. D., Hartwell, T. D., Sparacino, C., Whitmore, R., Sheldon, L., Zelon, H., and Perritt, R. 1987. The TEAM (Total Exposure Assessment Methodology) Study: Personal exposures to toxic substances in air, drinking water, and breath of 400 residents of New Jersey, North Carolina, and North Dakota. Environ. Res. 43:290–307.
Wallace, L. A. 1987. The total exposure assessment methodology (TEAM) study: Summary and analysis: Volume 1. Washington, DC: Office of Research and Development, U.S. Environmental Protection Agency.
Wallace, L. A., Pellizzari, E., Hartwell, T. D., Whitmore, R., Zelon, H., Perritt, R., and Sheldon, L. 1988. The California TEAM study: Breath concentrations and personal exposures to 26 volatile compounds in air and drinking water of 188 residents of Los Angeles, Antioch, and Pittsburg, CA. Atmos. Environ. 22:2141–2163.
Wallace, L. A., Pellizzari, E., and Hartwell, T. D. 1985. Personal exposures, indoor–outdoor relationship and breath levels of toxic air pollutants measured for 355 persons in New Jersey. Atmos. Environ. 19:1651–1661.
Weisel, C. P., Zhang, J., Turpin, B. J., Morandi, M. T., Colome, S., Stock, T. H., Spektor, D. M., Korn, L., Winer, A., Alimokhtari, S., Kwon, J., Mohan, K., Harrington, R., Giovanetti, R., Cui, W., Afshar, M., Maberti, S., and Shendell, D. 2005. Relationship of indoor, outdoor and personal air (RIOPA) study: Study design, methods and quality assurance/control results. J. Expos. Anal. Environ. Epidemiol. 15:123–137.
Atmospheric Environment 43 (2009) 2296–2302
Characterizing relationships between personal exposures to VOCs and socioeconomic, demographic, behavioral variables
Sheng-Wei Wang a,*,1, Mohammed A. Majeed a,b, Pei-Ling Chu c, Hui-Chih Lin d
a University of Medicine and Dentistry of New Jersey (UMDNJ), Robert Wood Johnson Medical School, NJ, USA
b Department of Natural Resources & Environmental Control (DNREC), State of Delaware, USA
c Novo Nordisk Inc., Princeton, NJ, USA
d Department of Marketing & Distribution Management, The Overseas Chinese Institute of Technology, Taiwan
ARTICLE INFO
Article history: Received 8 September 2008; Received in revised form 21 January 2009; Accepted 25 January 2009
Keywords: Volatile organic compounds; Personal exposures; Time-activity patterns; Socio-demographic factors; NHANES
ABSTRACT
Socioeconomic and demographic factors have been found to significantly affect time-activity patterns in population cohorts, which can subsequently influence personal exposures to air pollutants. This study investigates relationships between personal exposures to eight VOCs (benzene, toluene, ethylbenzene, o-xylene, m-,p-xylene, chloroform, 1,4-dichlorobenzene, and tetrachloroethene) and socioeconomic, demographic, and time-activity pattern factors using data collected from the 1999–2000 National Health and Nutrition Examination Survey (NHANES) VOC study. Socio-demographic factors (such as race/ethnicity and family income) were generally found to significantly influence personal exposures to the three chlorinated compounds. This was mainly due to the associations between race/ethnicity and urban residence, race/ethnicity and use of air freshener in the car, and family income and use of dry-cleaners, which can in turn affect exposures to chloroform, 1,4-dichlorobenzene, and tetrachloroethene, respectively. For BTEX, the traffic-related compounds, housing characteristics (leaving home windows open and having an attached garage) and personal activities related to the use of fuels or solvent-related products played more significant roles in influencing exposures. Significant differences in BTEX exposures were also commonly found in relation to gender, due to associated significant differences in time spent at work/school and outdoors. The coupling of Classification and Regression Tree (CART) and Bootstrap Aggregating (Bagging) techniques was used as an effective tool for characterizing the robust sets of significant VOC exposure factors presented above, which conventional statistical approaches could not accomplish. Identification of these significant VOC exposure factors can be used to generate hypotheses for future investigations about possible significant VOC exposure sources and pathways in the general U.S. population. © 2009 Elsevier Ltd. All rights reserved.
* Corresponding author. IEH, 7F, No. 17, Xuzhou Rd., Taipei 100, Taiwan. Tel.: 886-2-33668107; fax: 886-2-33668114. E-mail addresses: [email protected], [email protected] (S.-W. Wang).
1 Institute of Environmental Health, National Taiwan University, Taipei, Taiwan.
1352-2310/$ – see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.atmosenv.2009.01.032
Wang et al: Reprinted from Atmospheric Environment, 43, Wang SW, MA Majeed, PL Chu, HC Lin, “Characterizing Relationships between Personal Exposures to VOCs and Socioeconomic, Demographic, Behavioral Variables,” 2296-2302, 2009, with permission from Elsevier.
1. Introduction
Volatile Organic Compounds (VOCs) are common air pollutants that can be found in both indoor and outdoor environments. There are numerous sources of VOCs, including gasoline, solvents, paints, and consumer products such as air fresheners, cleaning supplies, dry-cleaned clothing, and building or furnishing materials (USEPA, 2007). Several VOC exposure monitoring studies have reported that personal and indoor air concentrations of VOCs are higher than outdoor ones, and that indoor sources and personal activity can contribute significantly to personal exposures (Adgate et al., 2004; Serrano-Trespalacios et al., 2004; Sexton et al., 2004). Further, socioeconomic and demographic factors have been found to significantly affect time-activity patterns in population cohorts (McCurdy and Graham, 2003). It is important to know whether significantly different time-activity patterns defined by socioeconomic and demographic attributes also correlate with significantly different VOC exposures. Edwards et al. (2006) reported the relationships between VOC exposures and socio-demographic factors and time-activity patterns in the European exposure study, EXPOLIS (Jurvelin et al., 2001). Sexton et al. (2007) and Liu et al. (2007) reported the relationships between VOC exposures and time-activity patterns for selected adult populations in different urban areas of the U.S. However, besides time-activity patterns, the impacts of socioeconomic and demographic factors on personal
exposures to VOCs have not been adequately evaluated for the general U.S. population. The 1999–2000 National Health and Nutrition Examination Survey (NHANES) VOC project dataset (CDC, 2006a) provides an excellent and unique data source to correlate personal exposures to VOCs with socioeconomic, demographic, housing, and time-activity factors for the general U.S. population. The objectives of the current study were (1) to examine the relationships between VOC exposures and socio-demographic and lifestyle (i.e. housing and time-activity) variables, and (2) to characterize significant VOC exposure factors among these variables for a large population-based sample of the general U.S. population by analyzing the 1999–2000 NHANES VOC data.
2. Materials and methods
2.1. Data source
The aims of the 1999–2000 NHANES VOC study were to characterize exposures to VOCs in the general U.S. population and determine predictors of exposure. This was the first time that NHANES included personal exposure measurements for VOCs. Participants were a representative sub-sample of NHANES subjects between the ages of 20 and 59 years. Personal air measurements were available for ten VOCs: benzene, chloroform, 1,4-dichlorobenzene (PDB), ethylbenzene, methyl tertiary-butyl ether (MTBE), tetrachloroethene (PERC), toluene, trichloroethylene (TCE), o-xylene, and m-,p-xylene. Information about individual demographics, socioeconomic status, and residences, as well as time and activity data for the exposure period, was also available for this population subset. The time and activity data collected via the specially designed questionnaire can help identify possible sources of exposure and characterize personal activities that might contribute to exposure. The 1999–2000 NHANES study uses a stratified, multistage probability sample of the non-institutionalized US civilian population.
Detailed information about the study design and operation of NHANES can be found in the analytical and reporting guidelines of the NHANES (CDC, 2006b). Participants of the 1999–2000 NHANES VOC project were asked to wear passive personal monitors (3M Organic Vapor Monitors) for a period of 48–72 h for measuring personal exposures to ten VOCs (CDC, 2006a). On their return, a short exposure questionnaire was administered to participants to assess personal activities and exposures related to VOC measurements. The collected personal air samples were analyzed via GC-MS.
Table 1 Overview of personal air measurements in the NHANES 1999–2000 VOC dataset for benzene, chloroform, ethylbenzene, tetrachloroethene, toluene, trichloroethene, o-xylene, m,p-xylene, 1,4-dichlorobenzene, and methyl tert-butyl ether (MTBE).
VOC                   N^a   Percentage of measurements at or above limit of detection^b   Geometric mean (µg m−3)   Geometric standard deviation (µg m−3)
benzene               647   77.43                                                         1.26                      10.61
chloroform            651   86.02                                                         0.76                      9.31
ethylbenzene          642   97.51                                                         2.50                      4.27
tetrachloroethene     642   71.18                                                         0.32                      19.11
toluene               638   94.98                                                         13.96                     5.04
trichloroethylene     644   30.75                                                         0.03                      16.47
o-xylene              646   94.89                                                         2.12                      5.44
m,p-xylene            646   97.06                                                         6.15                      4.91
1,4-dichlorobenzene   644   77.64                                                         1.61                      24.33
MTBE                  644   36.18                                                         0.11                      23.91
a N: number of total available measurements.
b If the measurement was below the limit of detection, the concentration was reported as the limit of detection divided by the square root of 2.
Table 1 summarizes the numbers of available
measurements of the ten VOCs, the percentages of measurements at or above the limits of detection (LODs), and their geometric means and geometric standard deviations. TCE and MTBE were excluded from the data analysis, since less than 40% of the available measurements were above the respective LODs for both chemicals. For the remaining eight VOCs, if a measurement was below the LOD, the reported concentration (the LOD divided by the square root of 2) was used for data analysis.
Socioeconomic and demographic attributes of the participants were extracted from the full survey data of 1999–2000 NHANES, including age, gender, education, race/ethnicity, and poverty income ratio (i.e. ratio of family income to poverty threshold). A smoking status variable was also assigned to each of the participants based on their measured serum cotinine levels. The status "smoker or Environmental Tobacco Smoke (ETS) exposed" was assigned to participants whose serum cotinine levels were greater than 14 ng ml−1, and all others were classified as non-smokers (Lin et al., 2008). The collected exposure questionnaire data provided participants’ responses to 30 questions (i.e. 30 variables) related to housing characteristics as well as time and activity patterns of participants during the exposure monitoring period. Two variables (wearing the exposure badge at all times and hours badge not worn) were excluded from the data analysis, since they were used to perform the data cleaning procedure for the corresponding personal air measurements. Therefore, there was a total of 34 variables (5 socioeconomic and demographic attributes, 1 smoking status variable, and 28 questionnaire variables) among the socioeconomic, demographic, housing, and time-activity factors for examining their relationships with VOC exposures.
2.2. Statistical analyses
The NHANES data were collected based on a complex sampling design with sampling weights for generating national estimates.
In the current study, we conducted unweighted statistical analyses, since the results were not used to estimate population parameters that can be generalized as national estimates. Instead, they were used from an exploratory perspective to identify significant VOC exposure factors, which can be used to generate hypotheses about possible significant exposure sources and pathways in future studies.
In order to reveal the impacts of socioeconomic and demographic factors on VOC exposures, univariate analyses were conducted first to examine group differences in personal exposures stratified by socioeconomic, demographic, and smoking attributes. Student’s t-test was used to examine differences between two groups. The Bonferroni adjustment was used for multiple comparisons. However, when all of the predictor variables (including socioeconomic, demographic, housing, and time-activity factors) were involved in the data analysis, several characteristics of the NHANES VOC dataset needed to be recognized. First, the dataset includes a large number of variables, which are of disparate types (i.e. continuous and categorical). Second, high correlations may exist among exposure factors (collinearity), as well as non-linear and interaction effects among exposure factors that influence VOC exposures. These characteristics make it difficult to perform data analysis using conventional statistical techniques. The approaches of Classification and Regression Tree (CART) and Bootstrap Aggregating (Bagging) were used in the current study to resolve the above challenges in analyzing the NHANES VOC dataset.
CART was used to explore potential non-linear and interaction effects among exposure factors on personal exposures to VOCs. The CART models comprise a collection of rules that partition the space of the dependent variable as a function of the predictor variables (Breiman et al., 1984).
The rules are constructed by a recursive partitioning procedure using a "training dataset" containing values of the dependent and predictor variables. Over-fitting of the CART model can be prevented through K-fold cross-validation (CV) as follows: (1) randomly split the training dataset into K subsets (typically K = 10, as used in the current study) of approximately equal size; (2) leave out each subset in turn, construct a CART model using the remaining subsets, and repeat K times; (3) identify the optimal CART model by selecting the one with the best predictive performance on the observations that were left out in the construction of the model. The CART method has been used to characterize associations of biomarkers of exposure with environmental, dietary, demographic, and activity variables for benzene and lead (Roy et al., 2003).
The Bagging algorithm (Breiman, 1996) was used to resolve the issue of collinearity and obtain a data-driven importance measure for the predictor variables. It uses bootstrapping to generate multiple training sets. The base algorithm (such as CART) is then used to create a different base model instance for each bootstrapped training set. Combining multiple instances of the same model type can reduce the variance and drastically improve predictive performance. Bagging gives the greatest improvement when the model instances are very different from each other. Two parameters must be determined to conduct the Bagging analysis: the probability (P) used for generating the bootstrapped samples and the number of times (N) bootstrapping is performed. If P is small (such as 0.1 or 0.2), the bootstrapped samples would be too small and the constructed tree models would not be robust. Conversely, large values of P (such as 0.8 or 0.9) would result in similar bootstrapped samples, which could not provide the instability needed by the Bagging approach.
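The resampling idea described above, together with the count-based importance measure the study derives from it, can be illustrated with a deliberately reduced sketch. It is not the authors' implementation (they report using MATLAB, R, and SAS). Several points are assumptions made only for illustration: each observation is kept with probability P in every round, the "primary predictor" is taken to be the variable chosen for the root split, a single SSE-minimizing split stands in for a full cross-validated CART model, and the data are synthetic. The function names are hypothetical.

```python
import random
from collections import Counter

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_root_split(rows, y):
    """Index of the predictor whose single binary split most reduces SSE,
    i.e. the root split of a regression tree; None if no split helps."""
    best_var, best_cost = None, sse(y)
    for j in range(len(rows[0])):
        # Drop the largest value so both sides of the split are non-empty.
        thresholds = sorted({r[j] for r in rows})[:-1]
        for t in thresholds:
            left = [yi for r, yi in zip(rows, y) if r[j] <= t]
            right = [yi for r, yi in zip(rows, y) if r[j] > t]
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best_var, best_cost = j, cost
    return best_var

def root_split_counts(rows, y, p=0.3, n_rounds=1000, seed=0):
    """Count how often each predictor is the root split over n_rounds
    random subsamples, each keeping an observation with probability p
    (an assumed reading of the study's parameter P)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_rounds):
        idx = [i for i in range(len(rows)) if rng.random() < p]
        if len(idx) < 2:
            continue  # too few observations to split
        var = best_root_split([rows[i] for i in idx], [y[i] for i in idx])
        if var is not None:
            counts[var] += 1
    return counts

# Synthetic, purely illustrative data: predictor 0 (binary) drives the
# log-concentration, predictor 1 is unrelated noise.
data_rng = random.Random(42)
rows = [(data_rng.randint(0, 1), data_rng.random()) for _ in range(80)]
y = [2.0 * r[0] + data_rng.gauss(0.0, 0.3) for r in rows]

n_rounds = 200  # the study used N = 1000; reduced here to keep the demo fast
counts = root_split_counts(rows, y, p=0.3, n_rounds=n_rounds)
threshold = 0.05 * n_rounds  # mirrors the study's 50-of-1000 cutoff
significant = sorted(v for v, c in counts.items() if c > threshold)
print(counts, significant)
```

With a library such as scikit-learn available, the stump could be replaced by a pruned regression tree fitted to each subsample; the counting logic would stay the same.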
Through an iterative search, we found the optimal value of P to be 0.3, since the constructed "Bagging Trees" had the best performance in predicting the personal air concentrations of the eight VOCs through the cross-validation procedure. To develop the "importance measure" of a predictor, we counted the number of times, out of the N constructed optimal CART models, that the variable was identified as the primary predictor. The "importance measure" provides a quantitative scale for the significance of a predictor's contribution to the predictive performance on the response variable. The higher the count, the more significant the predictor is for determining personal exposure. N should be large enough to produce stable Bagging analysis results. We set N to 1000 in the current study, and identified predictors with an importance measure of more than 50 counts as significant exposure factors, which is equivalent to a statistical significance level of 0.05 (50/1000).
Before conducting the univariate, CART, and Bagging analyses, a data cleaning procedure was performed to exclude personal air measurements collected with significantly less sampling time, based on the questionnaire responses to "wearing the exposure badge at all times" and "hours badge not worn". Only a small percentage of participants (less than 5%) were excluded. A natural logarithmic transformation was applied to the measured personal air concentrations, since the distributions were skewed to the right. Outliers were also identified through normal probability plots of the log-transformed data and then excluded from the data analysis. The software packages MATLAB, R, and SAS were used to perform the data analyses in this study.
3. Results and discussion
3.1. Univariate analyses of VOC exposures vs. socio-demographic factors
3.1.1. Age and gender
The effect of age was only revealed in chloroform exposures, with a significantly negative correlation, indicating that younger participants had higher chloroform exposures. Significant differences in personal exposures to benzene, ethylbenzene, o-xylene, and m-,p-xylene were observed between males and females, with males having higher exposures than females (see Table 2). Edwards et al. (2006) reported similar findings of gender differences in exposures to traffic-related aromatics (i.e. ethylbenzene, o-xylene, and m,p-xylene) in the EXPOLIS study. However, Edwards et al. (2006) did not find significant gender differences in benzene exposures. Schweizer et al. (2007) reported that men spent less time at home than women, and that men tended to work away from home, in the EXPOLIS study. Further, Graham and McCurdy (2004) suggested using age and gender as "first-order" attributes to identify statistically significantly different cohorts with respect to the time spent indoors, outdoors, and in vehicles by analyzing the USEPA Consolidated Human Activity Database (CHAD). In this study, we found that males spent significantly more time at work/school than females (male mean: 10.37 h, female mean: 7.57 h, p-value: 0.0001); males also spent significantly more time outdoors than females (male mean: 10.40 h, female mean: 6.94 h, p-value < 0.0001) during the exposure monitoring period. This finding might suggest that males could spend more time commuting to work, resulting in elevated exposures to traffic-related aromatics. Significant gender differences were not observed in exposures to toluene and the three chlorinated chemicals (chloroform, PERC, and PDB).
3.1.2. Race/ethnicity
Significant differences were observed in exposures to benzene and the three chlorinated chemicals (chloroform, PERC, and PDB) among different race/ethnicity groups (see Table 3). For benzene, Mexican Americans had higher exposures than both non-Hispanic whites and blacks. Churchill et al.
(2001) reported that Mexican Americans were less likely to have elevated blood benzene levels than non-Hispanic whites in their analysis of the NHANES-III blood VOC data. However, as pointed out by Lin et al. (2008), the blood–air relationships of BTEX were influenced by factors such as age, gender, BMI, and smoking. Thus, higher benzene exposures would not necessarily correspond to higher benzene blood levels.
For chloroform, non-Hispanic blacks had higher exposures than non-Hispanic whites and Mexican Americans. Churchill et al. (2001) reported a similar finding that non-Hispanic blacks were more likely to have elevated chloroform blood levels than non-Hispanic whites in the NHANES-III blood VOC study. Churchill et al. (2001) also indicated a protective effect of rural residence, since increased well-water use means less exposure to chlorine-treated water. By examining the questionnaire responses to "description of street where you live", we found that non-Hispanic whites had a significantly higher proportion of rural residence than non-Hispanic blacks (white proportion: 0.20, black proportion: 0.08, p-value: 0.001), thus resulting in lower chloroform exposures.
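A comparison of two proportions such as the rural-residence contrast above can be sketched with a pooled two-proportion z-test. The paper does not state which test produced its p-value, so the choice of test here is an assumption, and the group sizes below are hypothetical values chosen only so that the proportions equal the reported 0.20 and 0.08; the resulting numbers are not expected to reproduce the published p-value.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for H0: p1 == p2; returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical group sizes (NOT from the paper); counts are chosen so the
# proportions match the reported 0.20 and 0.08.
n_white, n_black = 250, 150
rural_white, rural_black = 50, 12

z, p_value = two_proportion_z_test(rural_white, n_white, rural_black, n_black)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

When several groups are compared pairwise, as with the three race/ethnicity groups here, a Bonferroni adjustment would multiply each p-value by the number of comparisons (capped at 1) before judging significance.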
Table 2 Gender differences in personal VOC exposures (µg m−3). Chemical
benzenea toluene ethylbenzenea o-xylenea m,p-xylenea chloroform tetrachloroethene 1,4-dichlorobenzene
Male
Female
p-value
Mean
N
Mean
N
6.33 28.19 6.54 6.46 19.43 2.65 7.71 30.11
295 282 289 288 290 296 292 291
4.67 23.42 3.99 3.84 11.44 2.90 3.23 43.22
352 347 347 352 352 355 350 350