Development of Exposure Distribution Parameters for Use in Monte Carlo Risk Assessment of Exposure Due to Soil Ingestion

Final Report
December 1999

Edward J. Stanek III (1), Edward J. Calabrese (2), Martha Zorn (1)

(1) Department of Biostatistics and Epidemiology
(2) Department of Environmental Health
University of Massachusetts at Amherst, Amherst, Massachusetts
Sponsor: USEPA R8 Ecosystems Protection & Remediation. Contract: LOR056 1998 T 08L, TIN: 043-16-7352
Abstract and Summary

Monte Carlo analysis of exposure due to soil ingestion requires knowledge of how soil ingestion is distributed between subjects, and of how it is distributed over time and between days within subjects. Such distributions are best estimated from mass-balance soil ingestion studies such as the study in Anaconda, Montana, conducted on a random sample of 64 children between the ages of 1 and 4 residing on a Superfund site. Most analyses of mass-balance soil ingestion study data have focused on subject-specific estimates of soil ingestion over the study period, and in particular on the 95th percentile of ingestion for a subject, not on parameters that represent the long-term soil ingestion distribution and the uncertainty in the distribution estimates. This research develops appropriate distribution estimates for soil ingestion for use in Monte Carlo analysis. The research uses Anaconda, Montana soil ingestion data on children to construct percentile estimates, with their uncertainty, for use in Monte Carlo assessment of exposure due to soil ingestion, and provides recommendations for use of the resulting parameters. In addition, this research identifies the potential impact of factors such as study design, transit time assumptions, source error, misspecified play areas, and trace element absorption from food on soil ingestion estimates.
danc17.doc 12/31/99
Table of Contents

Abstract and Summary
List of Tables
List of Figures
I. Introduction
   A. Research Objectives
   B. Overview of the Report
II. Background
   A. Preliminary Studies
III. Results
   A. Daily Estimates of Soil Ingestion among Anaconda Children
   B. Strategies for Reducing Source Error
      Introduction
      Results
      Discussion
   C. Factors that Contribute to Bias and Uncertainty in Soil Ingestion
      Biasing Factors for Simple Soil Ingestion Estimates
      Construction of the Simulation
      Results of the Simulation
         The Impact of Study Duration
         The Impact of Ingestion of Soil from Neighbor's Yards and/or Absorption of Trace Elements from Food
      Sensitivity of Soil Ingestion Estimate to Transit Time Assumptions
      Estimates of the Long Term Soil Ingestion Distribution and Uncertainty
   D. Recommendations for Monte Carlo Analyses
IV. Discussion
V. Further Research Needed
References
Appendices
   Appendix A. Daily Soil Ingestion Estimates for Children at a Super-fund Site
   Appendix B. Biasing Factors for Simple Soil Ingestion Estimates in Mass Balance Soil Ingestion Studies
   Appendix C. Soil Ingestion Distributions for Monte Carlo Risk Assessment in Children
List of Tables

Table 1. Distribution of Total Number of Days with a Fecal Sample, and Total Number of Days with Soil Ingestion Estimates Among 64 Children
Table 2. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children
Table 3. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children
Table 4. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Amherst Outlier Criteria
Table 5. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Amherst Outlier Criteria
Table 6. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Factor Score Outlier Criteria
Table 7. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Factor Score Outlier Criteria
Table 8. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Tukey Outlier Criteria
Table 9. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Tukey Outlier Criteria
Table 10. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates where criteria were outliers
Table 11. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates where criteria were outliers
Table 12. Summary of Mean and Variance Component Estimates of Daily Soil Ingestion assuming Various Outlier Criteria for 64 Anaconda Children (Source: anced99p25.sas)
Table 13. Cumulative Distributions of Average Soil Ingestion Developed from Simulations using Various Soil Ingestion Distributions for 4 and 7 day studies
Table 14. Cumulative Distributions of Average Soil Ingestion Developed from Simulations using Soil Ingestion Distribution #6 1, taking into account absorption and ingesting different soil
Table 15. Number identified as outliers
Table 16. Distribution of Daily Soil Ingestion (mg/d) over 7 Days for 64 subjects in Anaconda, MT, based on 7 trace elements (excluding Ti) by varying food transit time
Table 17. Distribution of Long Term Daily Average Soil Ingestion (mg/d) based on 7 Trace Element estimates on up to 7 Days for 64 subjects in Anaconda, MT, assuming a 28 hour food transit time
List of Figures

Figure 1. Cumulative distribution of 7 day average soil ingestion estimates and 95% CI around quantiles for 64 children from the Anaconda Study
I. Introduction
A principal route of exposure to contaminants in soil among children is direct ingestion of contaminated soil [1-3]. Estimates of the extent of such exposure are based on soil ingestion studies such as the study conducted in Anaconda, Montana among a stratified random sample of 64 children. The study was a 7 day mass-balance study with measures of trace element intake from food, fecal output, and soil and dust obtained on consecutive days on each subject. Trace elements used in the study include Al, Cr, Ce, La, Nd, Si, Ti, Y, and Zr (although Cr was dropped due to analytic contamination). Among a sample of 30 children, As intake and fecal output were also measured. Soil ingestion estimates for the children and adults have been constructed based on a mass-balance approach [1,4]. Of principal concern has been variability between trace element estimates of soil ingestion for individual children. Subsequent research indicated that one source of such variability is heterogeneity of certain trace element concentrations in soil by particle size [1,2]. Between-element variance can be reduced by calibrating soil concentrations for particle size.

The exposures of particular interest are those in the upper percentiles of the exposure distribution. A common starting point for estimating these exposures is the empirical soil ingestion distribution. If daily soil ingestion rates did not vary between days for a subject and could be measured without error, such empirical estimates of the soil ingestion distribution would be reasonable. However, both of these assumptions are known to be false. Soil ingestion does vary between days (as clearly demonstrated by the pica child [5-6]). In addition, there is uncertainty, as evidenced by variability between different trace element estimates of soil ingestion on a given day. The impact of such variability is to broaden the soil ingestion distribution, resulting in lower percentile estimates that are too low and upper percentile estimates that are too high. Characterizing this variability is necessary to eliminate the artificial spread prior to simulating soil ingestion in Monte Carlo risk assessment analysis.
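The broadening described above can be illustrated with a simple variance-components sketch. The notation is an assumption introduced here for illustration and is not the report's: Y_ij is the soil ingestion estimate for child i on day j, mu_i is that child's long-term average, and e_ij combines day-to-day variation and estimation error, taken to be independent across days with a common variance.

```latex
\[
Y_{ij} = \mu_i + e_{ij}, \qquad
\operatorname{Var}(\mu_i) = \sigma_B^2, \qquad
\operatorname{Var}(e_{ij}) = \sigma_W^2
\]
\[
\bar{Y}_{i\cdot} = \frac{1}{d}\sum_{j=1}^{d} Y_{ij}
\quad\Longrightarrow\quad
\operatorname{Var}\bigl(\bar{Y}_{i\cdot}\bigr) = \sigma_B^2 + \frac{\sigma_W^2}{d}
\]
```

Under these assumptions, the d-day averages have a variance that exceeds the between-child variance by sigma_W^2 / d, so their empirical percentiles are more spread out than those of the long-term averages. The excess shrinks as the number of study days d grows.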
A. Research Objectives
This project builds on the preliminary studies of soil ingestion, using data collected in the Anaconda children’s soil ingestion study to more adequately characterize soil ingestion exposure for use in Monte Carlo risk assessment. First, estimates of daily soil ingestion among Anaconda children are constructed. The estimates are based on assumptions concerning transit time (28 hours) and the particle size of ingested soil (< 250 µm) [1,3]. These estimates describe the short-term average daily soil ingestion distribution, comparable to daily soil ingestion estimates for the Amherst, MA study [7,8]. While important for comparison purposes, the distribution does not account for uncertainty in the estimates, nor does it describe the distribution of long-term soil ingestion. We assess the limitations in the daily estimates by (1) developing estimates of uncertainty in daily estimates based on multiple trace element estimates; (2) evaluating strategies for identifying and reducing biasing factors due to “source error” in element-specific estimates on a daily basis; and (3) examining the sensitivity of daily soil ingestion estimates to transit time assumptions and other biasing factors. In this process, we develop a simulation model for soil ingestion studies and use it to characterize biasing factors. Finally, we develop a methodology to provide cumulative estimates of long-term soil ingestion and its uncertainty. On the basis of these results, we provide recommendations for use of Anaconda soil ingestion estimates in Monte Carlo analyses.

The first objective of this research is to develop daily soil ingestion estimates, and estimates of uncertainty in those daily estimates, based on Anaconda, Montana soil ingestion study data. This objective is achieved by developing daily estimates of soil ingestion for children in the Anaconda, Montana soil ingestion study. In the process of estimating daily soil ingestion, we estimate the distribution of average soil ingestion for children over the study period. In addition, we estimate the within-subject distribution of soil ingestion from day to day (for a given child). Finally, the daily estimates allow uncertainty in the daily estimate to be estimated from the variability between trace element estimates. We use a methodology similar to the one used to construct daily soil ingestion estimates in the Amherst, MA soil ingestion study [7]. Simple outlier criteria (based on food/soil ratios and extreme deviations) are used to evaluate the sensitivity of these estimates to high transit time error, or bias. The estimates are used to discuss soil ingestion distributions needed for a Monte Carlo analysis. We conclude this development by summarizing how soil ingestion distributions may be used for different time periods of possible ingestion, and by providing similar distribution estimates for children in the Amherst, MA soil ingestion study.

The second objective is to refine our ability to identify biasing factors in soil ingestion estimation. Such factors appear to be present in the Anaconda study, and may be due to sources similar to those identified in the Amherst children’s study [9]. This aspect of the research seeks to identify patterns in residuals for daily estimates among trace elements.
Clustering of residuals for trace elements may occur for several reasons: ingestion of trace elements from a non-food, non-soil source; or ingestion of soil of a different particle size with different element concentrations. Each of these potential explanations is investigated in this portion of the research. The results are applied to refine estimates for use in the Monte Carlo analysis.

The third objective is to quantify variability and uncertainty in estimates of the cumulative soil ingestion distribution. We concentrate on three areas to achieve this objective. First, we develop a simulation model for a mass-balance soil ingestion study that allows aspects of the study design, and certain assumptions underlying the mass-balance approach, to be evaluated. Using this simulation, we evaluate biases in the common strategy of using the average soil ingestion estimate per person and the distribution of these average estimates. Next, we evaluate the sensitivity of daily soil ingestion estimates to the transit time assumption (of 28 hours) by re-computing the daily soil ingestion estimates using six hour and 48 hour transit time assumptions. Finally, we use non-parametric variance estimates of the median soil ingestion on a day to evaluate the long-term cumulative distribution of soil ingestion for children. The resulting estimates are best linear unbiased predictors.

The final objective of this research is to summarize the results and provide recommendations for Monte Carlo analyses. The focus of this portion of the research is to provide guidance that is useful for risk assessors incorporating results of this research. In addition to the recommendations, we assess their limitations and provide suggestions for future research.
B. Overview of the Report
This report summarizes the results of the research objectives. We begin by providing a brief background to mass balance soil ingestion studies. This background introduces the strategy for estimating soil ingestion, and relates it to discussions in the literature on limitations of the estimates. Next, we present results on each of the main objectives of the research. This presentation is organized in the order of the research objectives. For some objectives, manuscripts have been developed and submitted for publication in the peer reviewed literature. We include such manuscripts in appendices to this report, and reference them in the results sections. Finally, we discuss the results of this research, and identify areas where more research is needed.
II. Background: Estimating Soil Ingestion in Children: The Mass-Balance Methodology
Historically, qualitative attempts at estimating childhood soil ingestion have been made by several groups [10-12], with estimates derived from a set of “reasonable” assumptions, or from a set of assumptions accompanied by some behavioral observations. Such estimates have varied widely, and have been difficult to summarize quantitatively. For these reasons, quantitative methods for estimating the amount of soil ingested by children using a mass balance approach have been developed, and have become standard [13].

The mass balance approach for soil ingestion is based on using a soil trace element as a marker of soil ingestion. If a non-bioavailable trace element exists that occurs uniformly in soil (and not in food), then simply measuring the quantity of the trace element in fecal samples from children, along with the soil concentration, will enable estimation of the amount of soil ingested by children. In practice, such “perfect” trace elements do not exist, since small quantities of trace elements occur in food, concentrations of trace elements in soil are not uniform, and trace elements may have some bioavailability. These considerations have led to the use of multiple trace elements, where the eligible trace elements have low bioavailability and are present in high concentrations in soil and low concentrations in food. Furthermore, to reduce the impact of trace element ingestion from food, mass balance soil ingestion studies have collected duplicate food samples over time periods comparable to the fecal sample collections, and have subtracted the quantities of trace elements in food before calculating the ingested soil amount from the residual fecal trace element amounts. Since concentrations of trace elements in soil vary by location, a typical quantitative soil ingestion study measures trace element concentrations in food, fecal samples, and soil for each study subject, along with the absolute amount of food and fecal samples for each study subject. Such measures are made over time, with time periods varying up to 7 consecutive days (as in the soil ingestion study in Anaconda, Montana [1-2]).
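The subtraction and division just described can be written as a short calculation. The following is a minimal sketch, not the report's code: the function name, the units, the measurement values, and the use of a median to combine elements at this step are all assumptions chosen for illustration.

```python
from statistics import median


def soil_ingestion_mg(fecal_ug, food_ug, soil_conc_ug_per_g):
    """Mass-balance estimate of ingested soil (mg) from one trace element.

    fecal_ug: amount of the element in the fecal sample (micrograms)
    food_ug: amount of the element in the matching food sample (micrograms)
    soil_conc_ug_per_g: element concentration in the child's soil (micrograms per gram)
    """
    residual_ug = fecal_ug - food_ug  # element not accounted for by food
    # residual / concentration gives grams of soil; convert to milligrams
    return 1000.0 * residual_ug / soil_conc_ug_per_g


# Hypothetical one-day measurements: (fecal, food, soil concentration)
day = {
    "Al": (12.0, 2.0, 100.0),
    "Si": (50.0, 10.0, 500.0),
    "Y": (9.0, 1.0, 50.0),
}
estimates = {element: soil_ingestion_mg(*m) for element, m in day.items()}

# One way to combine the element-specific estimates into a daily value
daily_estimate = median(estimates.values())
```

Because each element yields its own estimate, disagreement between the values in `estimates` gives a direct view of the between-element variability that motivates using several trace elements.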
A. Preliminary Studies
Initial soil ingestion estimates for Anaconda children have been constructed as simple short-term average daily soil ingestion calculated from a mass-balance equation for each child. Different estimates have been formed using different trace elements. Protocols for soil concentration measurement originally were based on soil passing through a 2 mm sieve, but subsequent protocols were developed to estimate soil concentrations based on a 250 µm sieve, a 100 µm sieve, and a 50 µm sieve. Evidence for ingestion of smaller particle size soil is based on enhanced consistency of soil ingestion estimates based on different trace elements for a subject using a particular particle size concentration [2,3]. Certain elements (Ce, La, Nd) concentrate in particles less than 250 µm, but do not concentrate further among smaller particle sizes. Such comparisons (based on average daily estimates over a 7 day period for each child) have indicated that the closest correspondence between element-specific estimates is obtained with soil sieved to < 250 µm. Concentrations in soil sieved to < 250 µm have been used to estimate soil ingestion in Anaconda children due to the improved reliability of trace element estimates [3].

Daily soil estimates are constructed as part of this research using this soil size fraction, along with a protocol similar to the one used for the soil ingestion study of Amherst, MA children [7,8]. Daily estimates are constructed by linking the food samples to fecal samples using an assumed 28 hour transit time. Average trace element amounts in food are used to estimate trace element input on days prior to the start of the study for use in the mass-balance estimates. Trace elements are accumulated from food in intervals between fecal samples so that they can be removed from fecal sample trace element totals. The procedure results in multiple daily estimates of soil ingestion for each subject. This methodology is adapted and applied to data from the Anaconda, MT soil ingestion study.
III. Results
We present results for each of the main research questions in this section. First, we present the development of daily soil ingestion estimates for children in Anaconda, Montana. A manuscript that summarizes these results has been submitted for publication. Our discussion highlights the results of the manuscript. Next, we discuss methods that were explored to identify source error in daily soil ingestion estimates. Source error may potentially have a large role in both bias and uncertainty in soil ingestion estimates. The results we present discuss use of factor analysis and other strategies to identify potential source error.

The third section presents results on the sensitivity of soil ingestion estimates to a variety of assumptions inherent in the methodology. First, we present details on the development of a soil ingestion simulation model. The simulation takes as input parameters for the concentration distribution of trace elements in food and soil; parameters for the food intake distribution; parameters for the soil ingestion distribution; and parameters for the transit time distribution. Using such parameters, data similar to those obtained by previous soil ingestion studies are generated. Although the simulated data are artificial, the true soil ingestion is known (via specification in the simulation), so the ability of current study designs and estimation strategies to characterize soil ingestion can be evaluated. The results of these evaluations are presented, with a manuscript summarizing the results included in Appendix B. Next, we present results of daily soil ingestion estimates based on different transit times of 12 and 48 hours. Finally, we evaluate variability and uncertainty simultaneously for the cumulative soil ingestion distribution using small sample estimates of the variance of the median soil ingestion on a day. These results are summarized in a manuscript in Appendix C. In conclusion, we discuss the implications of these results relative to Monte Carlo use for risk assessment.
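The idea behind such a simulation, generating data with a known true soil ingestion and then comparing it with the mass-balance estimate, can be sketched in a few lines. This is an illustrative toy and not the model developed in Appendix B: every distribution and parameter value below is a hypothetical placeholder, a single trace element is used, and the transit time distribution is omitted (food and fecal samples are assumed to be perfectly matched).

```python
import random
from statistics import mean


def simulate_child(days, rng, soil_conc=100.0, food_error=0.2):
    """Return (true_mean, estimated_mean) daily soil ingestion in mg for one child.

    soil_conc: trace element concentration in soil (micrograms per gram), hypothetical.
    food_error: relative error in the measured food amount of the element, hypothetical.
    """
    true_vals, est_vals = [], []
    for _ in range(days):
        soil_mg = rng.lognormvariate(3.5, 1.0)  # true soil ingested that day (placeholder)
        food_ug = rng.lognormvariate(1.0, 0.5)  # element ingested in food (placeholder)
        # Element excreted = soil contribution + food contribution (no absorption assumed)
        fecal_ug = soil_mg / 1000.0 * soil_conc + food_ug
        # The duplicate food sample measures the food contribution with some error
        measured_food_ug = food_ug * rng.uniform(1 - food_error, 1 + food_error)
        est_vals.append(1000.0 * (fecal_ug - measured_food_ug) / soil_conc)
        true_vals.append(soil_mg)
    return mean(true_vals), mean(est_vals)


def simulate_study(n_children, days, seed=0):
    """Simulate a study; returns one (true_mean, estimated_mean) pair per child."""
    rng = random.Random(seed)
    return [simulate_child(days, rng) for _ in range(n_children)]


# Example: a 64-child study of 7 days, as in the Anaconda design
study = simulate_study(n_children=64, days=7)
```

Comparing the distribution of the estimated means with that of the true means, for example for 4 day versus 7 day studies, is the kind of evaluation this design permits. Sources of bias such as absorption or soil from other locations would need additional terms that this sketch does not include.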
A. Daily Estimates of Soil Ingestion among Anaconda Children
The first objective of the research is characterization of the soil ingestion distribution among children based on Anaconda, Montana soil ingestion study data. This objective is met by constructing daily soil ingestion estimates for the children. The daily estimates, and the distribution of estimates, are characterized using empirical distributions and variance components. Results of these analyses have been submitted in a manuscript for publication in the peer reviewed literature (Appendix A).

The Anaconda study is a 7-consecutive day mass balance soil ingestion study conducted on a stratified random sample of 64 children between the ages of 1 and 4 living in Anaconda, Montana. Details of the study are presented elsewhere [1-3]. Briefly, duplicate food samples and fecal output were collected on each day for seven days on study subjects. Food sample collection began one day prior to fecal sample collection. In addition, soil samples were collected from each child's yard, with preference given to soil collected in the child's activity area. Trace element concentrations in soil with particle size < 250 µm were used based on previous research [2-3]. These samples were processed and analyzed for eight trace elements (Al, Ce, La, Nd, Si, Ti, Y, and Zr). Food samples (and fecal samples) were combined each day prior to processing.

Daily soil ingestion estimates were constructed by matching food intake of trace elements to fecal output of trace elements on each day. The matching was made by calculating the trace element input in a 24 hour period that corresponded to trace element output. We used the observed pattern of fecal sample delivery (the distribution of days between fecal samples) to estimate the transit time from food to fecal sample (i.e., 28 hours). When fecal samples were reported by participants' parents to have been missed (on 10 of 322 days, or 3.1%) or partially missed (on 9 days, or 2.7%), we imputed fecal weights prior to constructing daily soil ingestion estimates. Use of soil with particle size < 250 µm accounted for differential concentration of the trace elements Ce, La, and Nd by particle size, and was consistent with the assumption that fine soil particles make up much of the soil ingested.

Criteria were developed to identify possible source errors and to select trace elements for ingestion estimates. When Ti was included as a trace element estimator of daily soil ingestion, the standard deviation between trace element estimates on a subject-day was eight times larger than when the soil ingestion estimate based on Ti was excluded. For this reason, soil ingestion estimates based on Ti were not included when forming daily soil ingestion estimates. Two additional strategies were used to identify outliers potentially due to source error. The first strategy is based on identifying element estimates for a subject-day that exceed the third quartile (or are below the first quartile) by more than three times the interquartile range. These estimates correspond to far-out values as described by Tukey in a box plot [14]. The second strategy made use of the outlier criteria used in the Amherst study [7]. A trace element estimate was identified as an outlier if it differed from the median subject-day specific soil ingestion estimate by varying amounts (which ranged from 2 g/d for median estimates of 10 g/d, to 50 mg/d for median estimates of 50 mg/d or less).
These criteria were based on relative standard deviations that ranged from 20% (for soil ingestion estimates of 10 g/d) to 100% (for soil ingestion estimates of 50 mg/d or less) [7]. Using the Tukey criteria, 0.45% of the element-subject-days were identified as outliers, whereas using the Amherst study outlier criteria, 31.9% of the element-subject-days were identified as outliers.

Eliminating outliers for trace elements that differ from other trace element estimates due to biasing factors such as source error (and not due to soil ingestion) will reduce bias in soil ingestion estimates. However, eliminating outliers identified by large differences that arise from sampling variability (and not biasing factors) will result in understating uncertainty in soil ingestion estimates. Because the two outlier criteria identified substantially different numbers of outliers, and the Tukey criteria identified less than 0.5% of the estimates as outliers, we chose not to exclude outliers when estimating daily soil ingestion. This decision is conservative since inclusion of outliers due to non-soil sources may elevate the upper percentiles of the soil ingestion distribution.

The results of daily soil ingestion estimates are presented in detail in a submitted manuscript (Appendix A). Of particular interest are estimates of the distribution of daily soil ingestion, plus estimates of uncertainty of the distribution. The manuscript in Appendix A develops estimates of the empirical daily soil ingestion distribution based on 7 day average estimates of the median (of the trace element specific soil ingestion estimates on a subject-day) for 64 subjects. The strategy used is comparable to the strategy used in the Amherst, MA soil ingestion study [7], and the results are compared to the corresponding results in that study. This manuscript integrates the results of the Amherst and Anaconda soil ingestion studies in providing an estimate of the cumulative distribution of 7-day average daily estimates of soil ingestion.
The manuscript is limited in that the robustness of the median is not captured in the uncertainty estimate in the mixed models used to estimate variance components. This is a result of the mixed model estimates being based on the average trace element estimate on a day, as opposed to the median estimate.

We focus on estimation of the 95th percentile of soil ingestion for a child in the study. We consider a set of estimates based on the different outlier criteria. First, we construct estimates based on the simple estimated soil ingestion for the subject, taking as the estimate the value of the 61st subject in a rank ordering of estimates (since 61/64 = 0.9531). A 95% interval estimate for this percentile (see Conover [15], p. 112) ranges from the 58th to the 64th estimate in the rank ordering. Such estimates are presented as a confidence band for the soil ingestion distribution in Figure 1.
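The ranks quoted above can be reproduced with a short calculation. The sketch below assumes the large-sample normal approximation to the binomial for the rank of a quantile, with ranks rounded up to the next integer and capped at the sample size; it may not be exactly the procedure in Conover that was used for the report, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def quantile_rank_interval(n, p, level=0.95):
    """Ranks (1-based) of the order statistics that estimate and bracket the p-th quantile.

    Uses the normal approximation: the number of observations below the quantile
    is roughly Normal(n*p, n*p*(1-p)). Ranks are rounded up and kept within 1..n.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    half_width = z * math.sqrt(n * p * (1.0 - p))
    point = math.ceil(n * p)
    lower = max(1, math.ceil(n * p - half_width))
    upper = min(n, math.ceil(n * p + half_width))
    return point, lower, upper


# 95th percentile among 64 children
point, lower, upper = quantile_rank_interval(64, 0.95)
```

For n = 64 and p = 0.95 this gives the 61st ordered estimate as the point estimate, bracketed by the 58th and 64th. The upper limit is capped at the largest observation, which indicates how little information 64 subjects carry about the extreme upper tail.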
B. Strategies for Reducing Source Error
Introduction
Attempts to improve the accuracy of soil ingestion estimates have focused on identifying and excluding potentially biased trace element estimates using outlier criteria based on the relative standard deviation [7,16], and on selecting more reliable trace element soil ingestion estimates based on low food/soil ratios [17,18]. Additional trace elements (Ce, La, and Nd) were included in the Anaconda study that were anticipated to occur minimally in food and thus be more reliable. The low food/soil ratios achieved for these elements (with the median daily amount ingested in food corresponding to less than 25 mg/d of soil) indicated the success of this effort. However, these trace elements occur in very small amounts in soil, and hence are more susceptible to error due to small amounts of non-soil, non-food ingestion. Such apparent source error was particularly evident for these elements, and motivated development of additional strategies for outlier identification.

Ingestion of non-food, non-soil trace elements will positively bias estimates of soil ingestion, since the trace element will erroneously be attributed to ingested soil. On the other hand, elimination of high soil ingestion values that are not outliers, but reflect actual soil ingestion, will bias soil ingestion estimates downward. Since positive biases are possible if not enough effort is made to identify and eliminate source error, and negative biases are possible if the elimination of unusual trace element estimates is too aggressive, we studied alternative strategies for identifying outliers. We identified outliers in the study in four ways.
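As a point of reference for the strategies compared in this section, the far-out rule described in Section III.A flags a trace element estimate on a subject-day when it lies more than a multiple of the interquartile range beyond the quartiles. The following is a minimal sketch of that rule only. The quartile definition (Python's "inclusive" interpolation), the function name, and the example values are assumptions; the report does not state how quartiles were computed.

```python
from statistics import quantiles


def far_out_flags(estimates, k=3.0):
    """Flag estimates more than k interquartile ranges beyond the quartiles.

    estimates: element-specific soil ingestion estimates (mg/d) for one subject-day.
    k: fence multiplier; 3.0 corresponds to Tukey's far-out values.
    """
    q1, _, q3 = quantiles(estimates, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x < low or x > high for x in estimates]


# Hypothetical estimates from seven trace elements on one subject-day
subject_day = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 400.0]
flags = far_out_flags(subject_day)
```

In this example only the last value is flagged. A smaller multiplier `k` flags more estimates, which is the trade-off discussed above: a rule that is too lenient leaves source error in place, while one that is too aggressive removes genuine high ingestion values.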
The first strategy that we used corresponded to the strategy used in the Amherst, MA soil ingestion study7. When forming daily estimates for children in the Amherst study, individual trace element estimates were identified as outliers if they differed from the median subject-day specific soil ingestion estimate by varying amounts based on the relative standard deviation (ranging from 20% for estimates of 10 g/d, to 100% for estimates of 50 mg/d or less). Details are given by Stanek and Calabrese7. This strategy resulted in a relatively large proportion of 'outlier' estimates (36%).
The second strategy searched for possible non-food, non-soil source errors by fitting (orthogonal) principal components to the multiple trace element estimates on each day. Trace element factor loadings indicated that the first principal factor represented common trace element soil ingestion, with subsequent factor loadings identifying days with other trace element ingestion patterns suggestive of non-food, non-soil sources. We repeated the factor analysis excluding Ti, since Ti was an important contributor to many of factors 2-6, and since large food/soil ratios were present for Ti on days. The factor analysis based on the remaining seven trace elements was used to identify outliers, with Ti estimates always considered to be outliers. We used factors 2-7 to identify subject-days with highly discrepant element estimates by identifying subject-days where a standardized factor score (for factors 2-7) was larger in absolute value than 2, and where the score was far-out (such that the score exceeded the third quartile, or was below the first quartile, by more than 2 times the inter-quartile range). These criteria identified 24% of the element-specific estimates as 'outliers'.
The third strategy is based on identifying element estimates for a subject-day that exceed the third quartile (or are below the first quartile) by more than 2 times the inter-quartile range. These estimates correspond to far-out values as described by Tukey14 in a box plot. This strategy identified 7% of the data as 'outliers'.
A final outlier strategy consisted of requiring an element estimate to be classified as an outlier on all three of the previous criteria. This strategy identified 6% of the element estimates as 'outliers'.
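The third and fourth strategies can be sketched as follows. This is a minimal illustration, not the SAS code used for the report: the quartile definition (Python's "inclusive" method) is an assumption, since the report does not state which definition was used, and the function names and example values are hypothetical.

```python
from statistics import quantiles


def far_out_flags(estimates, k=2.0):
    """Flag element estimates for one subject-day that lie more than
    k inter-quartile ranges above the third quartile or below the first.

    k=2 follows the multiplier quoted in the text; the quartile method
    is an assumption.
    """
    q1, _, q3 = quantiles(estimates, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x < lower or x > upper for x in estimates]


def combined_flags(*flag_lists):
    """Fourth strategy: an estimate is an outlier only if every
    supplied criterion flags it."""
    return [all(flags) for flags in zip(*flag_lists)]


# Hypothetical element estimates (mg/d) for a single subject-day.
example_day = [10, 12, 15, 18, 20, 22, 25, 400]
example_flags = far_out_flags(example_day)
```

With these hypothetical values only the 400 mg/d estimate is flagged. Note that when several elements are inflated at once the inter-quartile range widens and fewer estimates are flagged.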
Results
Using the methods outlined above, soil ingestion estimates were calculated for each trace element for each of the 321 days with fecal samples collected on the 64 children in the Anaconda study (Table 1). In total, these fecal samples spanned 422 ingestion days, or 94% of the possible 64*7=448 days in the 7 day study. For two children, soil concentrations among particle sizes less than 250 µm were not available due to small soil sample size. For these two subjects, we used trace element concentrations in soil < 2mm to calculate ingestion estimates. For one subject on two days, analytic results were available only for the trace elements Al, Si, and Ti.
Table 1. Distribution of Total Number of Days with a Fecal Sample, and Total Number of Days with Soil Ingestion Estimates Among 64 Children

             Total Days with            Total Days with
Number       Fecal Samples              Soil Ingestion Estimates
of days      # Children   Percent       # Children   Percent
_________________________________________________________________
1                     1       1.6                0         0
2                     1       1.6                0         0
3                    10      15.6                1       1.6
4                    14      21.9                0         0
5                    12      18.8                5       7.8
6                    10      15.6               12      18.8
7                    16      25.0               46      71.9
Source: anced99p13.sas on anc5-documents 2/3/99
We constructed estimates of soil ingestion for each child, and then summarized the distribution of the estimates in a manner similar to Stanek and Calabrese7. Estimates for individual trace elements in Table 2 are constructed by calculating the median soil ingestion for each child over days, and then tabulating the distribution of these median estimates among children. The column labeled Overall in Table 2 is calculated by first estimating the median soil ingestion for each subject-day among the trace element estimates. The median of these estimates over days for a child is then constructed, and the distribution of these medians summarized among children. Estimates for individual trace elements in Table 3 are constructed by calculating the average soil ingestion for each child over days, and then tabulating the distribution of these average estimates. Estimates in the column labeled Overall in Table 3 are calculated by first calculating the average trace element estimate for each subject-day. The average of these estimates over days is then calculated for each subject, and this distribution tabulated.
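The two "Overall" summaries described above can be sketched for a single child as follows. The data layout (one dictionary of element estimates per day), the function name, and the example values are hypothetical and chosen only to illustrate the order of the operations.

```python
from statistics import mean, median


def child_overall_summaries(daily_element_estimates):
    """Summarize one child's element-specific daily estimates (mg/d).

    daily_element_estimates: list with one dict per day, mapping a
    trace element name to its soil ingestion estimate on that day.
    """
    # Step 1: collapse the elements within each subject-day.
    daily_medians = [median(list(day.values())) for day in daily_element_estimates]
    daily_means = [mean(list(day.values())) for day in daily_element_estimates]
    # Step 2: collapse the days within the child.
    return {
        "median_of_daily_medians": median(daily_medians),  # Table 2 style
        "mean_of_daily_means": mean(daily_means),          # Table 3 style
    }


# Hypothetical estimates for one child over three days.
example_child = [
    {"Al": 10, "Si": 20, "Y": 90},
    {"Al": -5, "Si": 5, "Y": 30},
    {"Al": 40, "Si": 50, "Y": 60},
]
example_summary = child_overall_summaries(example_child)
```

The tabulated distributions are then obtained by computing these summaries for every child and taking percentiles across children.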
Table 2. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64     11    -16      2     31     85    111      155
AL             64    -10    -33    -11      6     12     41      524
CE             64     24     -2     14     35     97    137      180
LA             64     38     12     22     52    132    139      232
ND             64     77      5     45    122    261    331      401
SI             64    -29    -43    -26    -12      7     21      214
TI             64   -315   -411     10    125    466    618     2269
Y              64     33     -9     25     78    136    147      377
ZR             64    -38    -84    -44    -13     74    126      204
Source: anced99p13.sas on anc5-documents 2/3/99
Table 3. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64    -42    -83     13    100    183    202      471
AL             64     -1    -42     -4     14     68     87      411
CE             64     45      6     24     75    146    171      324
LA             64     78     20     42    106    232    281      463
ND             64    107     17     70    170    307    384      469
SI             64    -19    -40    -22      3     48     77      287
TI             64   -565   -559     16    303    825   1088     3008
Y              64     40     -8     27     85    208    233      377
ZR             64    -19    -70    -30      9    117    132      395
Source: anced99p14.sas on anc5-documents 2/3/99
Results in Table 2 for the column labeled Overall have the following interpretation. For 95% of the children, the soil ingestion estimate was 111 mg/day or less on at least half of the observed days. In a similar manner, for 50% of the children, the soil ingestion estimate was 2 mg/day or less on at least half of the observed days. A similar interpretation can be given to individual trace element estimates. Notice the large difference between the distribution of soil ingestion estimates based on Ti compared to the other trace elements in Table 2. Only 3.4% of the food/soil ratios are greater than 225 mg/d for elements not including Ti, whereas for Ti, 46% of the food/soil ratios are greater than 225 mg/d. The relatively large contribution of Ti from food results in much more highly variable soil ingestion estimates, when compared with estimates based on the other elements. For example, 96.3% of the daily estimates based on trace elements other than Ti fall in the range -250 mg/d to +250 mg/d, whereas only 49.5% of the daily estimates using Ti are in this range. Although half of the daily estimates of soil ingestion based on Ti are within the range of other element estimates, the Ti estimates are likely to be less reliable than those for other trace elements, since larger deviations may arise as a result of large transit time errors. For this reason, we exclude soil ingestion estimates based on Ti from further consideration when summarizing the distribution.
We fit a mixed model to the other element-subject-day estimates to form restricted maximum likelihood estimates of the variance in soil ingestion between elements, between days, and between subjects. These estimates indicate that the variance between days is approximately four times the variance between subjects, and the variance between elements (on a given subject-day) is about double the variance between days. For each subject, we use the median estimate for a day to form an average estimate for the subject (over days), and the median estimate for a subject (over days). We then calculate the deviations in soil ingestion per day, first taking deviations about the average soil ingestion for a subject, and second taking deviations about the median soil ingestion for the subject. Finally, we construct the distribution of these average soil ingestion estimates, and median soil ingestion estimates, over the subjects. We also construct the distribution of estimates similar to Tables 2 and 3 after first excluding Ti from the estimation.
The estimates in the previous section are sensitive to potential elevated soil ingestion due to non-food, non-soil ingestion of trace elements. We construct similar distributions of estimates after first eliminating element-specific estimates based on various outlier criteria, where the outlier criteria correspond to the four definitions given in the methods section [Amherst study criteria (Tables 4 and 5), Factor score criteria (Tables 6 and 7), Tukey criteria (Tables 8 and 9), and combined criteria (Tables 10 and 11)].
Table 4. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Amherst Outlier Criteria

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64     11    -15      3     31     87    116      155
AL             63      4    -18     -3     15     30     77      322
CE             64     31     -3     11     35     98    137      448
LA             63     36      8     21     49    127    153      269
ND             53     67      0     19     54    194    366      928
SI             62    -21    -35    -25    -10     19     56      102
TI             33     24     -7     17     38    105    113      114
Y              60     32     -9     18     41    140    206      383
ZR             56     11    -46     -6     48    128    179      309
Source: anced99p14.sas on anc5-documents 2/3/99
Table 5. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Amherst Outlier Criteria

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64     24     -7     16     40     99    123      158
AL             63     11    -15      1     19     64    109      322
CE             64     40     -0     18     53    112    137      448
LA             63     46     13     26     78    130    161      269
ND             53     74      0     23     61    194    366      928
SI             62     -9    -32    -17      8     56     80      144
TI             33     33      6     24     63    105    114      128
Y              60     41      1     25     52    168    232      383
ZR             56     26    -35      5     69    157    221      309
Source: anced99p14.sas on anc5-documents 2/3/99
Table 6. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Factor Score Outlier Criteria

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64      6    -19      2     26     67     77      136
AL             64    -21    -33    -12      5     10     15       60
CE             63     18     -5     12     36     64     97      140
LA             63     34      8     21     54    105    138      172
ND             63     63      4     39    113    210    261      401
SI             64    -35    -47    -27    -15     -3      7       26
TI              0      .      .      .      .      .      .        .
Y              63     23    -10     19     68    110    141      210
ZR             63    -48    -87    -48    -17     39     70      144
Source: anced99p15.sas on anc5-documents 2/3/99
Table 7. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Factor Score Outlier Criteria

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64     12    -21      7     30     82    107      140
AL             64    -20    -44     -9      7     20     34       71
CE             63     27     -4     17     48     95    111      188
LA             63     44     15     28     68    115    156      192
ND             63     73      7     45    114    216    276      357
SI             64    -30    -47    -24     -6      5     14       80
TI              0      .      .      .      .      .      .        .
Y              63     33    -11     23     77    128    150      299
ZR             63    -37    -79    -49      1     60    103      144
Source: anced99p15.sas on anc5-documents 2/3/99
Table 8. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Tukey Outlier Criteria

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64     13    -16      8     29     74    115      155
AL             64     -9    -30    -11      6     20     41      524
CE             64     24     -2     14     35     97    137      180
LA             64     38     12     22     52    132    139      232
ND             64     78      5     45    125    261    342      401
SI             64    -27    -43    -26    -12      7     21      321
TI             52     42    -12     27    101    174    505      578
Y              64     35     -7     25     78    136    147      377
ZR             64    -35    -84    -44     -7     74    128      257
Source: anced99p16.sas on anc5-documents 2/3/99
Table 9. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates based on Tukey Outlier Criteria

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64     35    -12     15     74    136    157      197
AL             64     -1    -37     -3     14     51     71      411
CE             64     45      6     24     75    146    171      324
LA             64     72     19     38    100    184    276      413
ND             64    107     17     65    170    307    399      469
SI             64    -18    -40    -22      3     48     77      300
TI             52     59    -23     38    109    235    459      578
Y              64     50     -5     27     85    208    233      377
ZR             64    -15    -70    -30     10    121    144      489
Source: anced99p16.sas on anc5-documents 2/3/99
Table 10. Distribution of median daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates identified as outliers by all three criteria

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64     13    -16      9     29     74    115      155
AL             64     -9    -30    -11      6     12     41      524
CE             64     24     -2     14     35     97    137      180
LA             64     38     12     22     52    132    139      232
ND             64     78      5     45    125    261    331      401
SI             64    -27    -43    -26    -12      7     21      321
TI             52     42    -12     27    101    174    505      578
Y              64     35     -8     25     78    136    147      377
ZR             64    -38    -84    -44    -13     74    126      204
Source: anced99p17.sas on anc5-documents 2/3/99
Table 11. Distribution of mean daily soil ingestion estimates per child (mg/day) for 64 Anaconda Children excluding element estimates identified as outliers by all three criteria

                          ----------- Percentile ----------
Element  Subjects   Mean   25th   50th   75th   90th   95th  Maximum
OVERALL        64     34    -12     15     76    136    157      197
AL             64     -1    -39     -4     12     51     71      411
CE             64     45      6     24     75    146    171      324
LA             64     72     19     38    100    184    276      413
ND             64    107     17     70    170    307    384      469
SI             64    -18    -40    -22      3     48     77      300
TI             52     59    -23     38    109    235    459      578
Y              64     49     -8     27     85    208    233      377
ZR             64    -19    -70    -30      9    117    132      395
Source: anced99p17.sas on anc5-documents 2/3/99
The estimates of soil ingestion vary with the criteria used to exclude outliers. The Amherst criteria eliminate the largest number of element-specific estimates. These exclusion criteria dramatically reduce the variability between trace elements for a subject-day. The estimated variance between trace elements on subject-day estimates (after eliminating Ti) was 22170 (mg/d)^2, compared with 1398 (mg/d)^2 after first eliminating estimates based on the Amherst criteria, a reduction of roughly 94%. This reduction in variance is due in part to the criteria themselves, since they eliminate large deviations by definition.
The Factor score criteria aimed to eliminate unusual patterns of trace element estimates when they occurred for a subject-day. These criteria did not result in as dramatic a reduction in the between element variance for a subject-day (resulting in a between element variance of 5807 (mg/d)^2). Other outlier criteria did not reduce the between element variability for a subject-day to the same extent. This is due in part to the sensitivity of the additional criteria to the size of the inter-quartile range of the available trace element estimates. If several trace elements are simultaneously subject to ingestion from non-food, non-soil sources, the inter-quartile range is large, and the between element variance will be large. We summarize variance components for the various analyses in Table 12. Note the similarity of estimates based on the Tukey criteria, all 3 criteria, and simply excluding Ti from the estimates. Other criteria (such as the Amherst criteria, or the Factor criteria) have more of an impact on the ingestion estimates and variance components, presumably due to elimination of some source error.
Table 12. Summary of Mean and Variance Component Estimates of Daily Soil Ingestion assuming Various Outlier Criteria for 64 Anaconda Children (Source: anced99p25.sas).

                      Mean     Subject Variance     Day Variance     Element Variance
Outlier Criteria     (mg/d)        (mg/d)^2           (mg/d)^2           (mg/d)^2
None                  -43.3           71150                  0            2092845
Exclude Ti only        32.4            2647               9700              22170
Amherst Criteria       24.0            1510               6528               1399
Factor Criteria        12.8            1635               2782               5807
Tukey Criteria         34.6            2389              11961              20580
All 3 Criteria         34.0            2422              11945              20676
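The relative sizes of the variance components in Table 12 can be checked with a few lines of arithmetic. The sketch below only restates values from the table; the dictionary layout and function names are illustrative.

```python
# (subject variance, day variance, element variance) in (mg/d)^2, from Table 12.
VARIANCE_COMPONENTS = {
    "Exclude Ti only": (2647, 9700, 22170),
    "Amherst Criteria": (1510, 6528, 1399),
    "Factor Criteria": (1635, 2782, 5807),
    "Tukey Criteria": (2389, 11961, 20580),
    "All 3 Criteria": (2422, 11945, 20676),
}


def variance_ratios(criteria):
    """Return (day/subject, element/day) variance ratios for one row."""
    subject, day, element = VARIANCE_COMPONENTS[criteria]
    return day / subject, element / day


def percent_reduction(before, after):
    """Percent reduction from 'before' to 'after'."""
    return 100.0 * (before - after) / before


day_to_subject, element_to_day = variance_ratios("Exclude Ti only")
amherst_reduction = percent_reduction(
    VARIANCE_COMPONENTS["Exclude Ti only"][2],
    VARIANCE_COMPONENTS["Amherst Criteria"][2],
)
```

For the "Exclude Ti only" row the day variance is between three and four times the subject variance and the element variance is a little over twice the day variance, in line with the approximate ratios quoted earlier in this section.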
Discussion
Attempts to reduce the impact of potential error due to non-food/soil trace element intake have focused in the past on identifying inconsistent outlier values. For example, when forming daily estimates for children in the Amherst study, individual trace element estimates were identified as outliers if they differed from the median subject-day specific soil ingestion estimate by varying amounts based on the relative standard deviation (ranging from 20% for estimates of 10 g/d, to 100% for estimates of 50 mg/d or less). The Amherst study outlier criteria were relatively stringent, and resulted in elimination of 1690 of the 3512 element-specific estimates (48%)7. We use simpler, less stringent criteria here to evaluate the sensitivity of the estimates to potential source error. A problem that has plagued all soil ingestion studies conducted to date has been the broad range of soil ingestion estimates obtained using different trace elements on the same subjects and time periods. For example, when based on the median soil ingestion estimate per day, the 95th percentile ranged from 21 mg/d to 618 mg/d across the elements (nearly a 30 fold range), while the estimates of the 95th percentile based on the mean ranged from 77 mg/d to 1088 mg/d (over a 14 fold range). The large range of values for the same set of subjects is a reflection of problems with the trace element soil ingestion methodology. In particular, while the methodology is successful in eliminating trace element quantities ingested from food from the fecal totals, amounts from non-food items are not eliminated. The fact that other sources of trace element intake exist is supported by a recent report on children's mouthing behaviors19. Since the absolute quantities of the trace elements are small, even small amounts of ingestion from non-food/soil sources can have a large impact on the ultimate soil ingestion estimate. The potential for such errors was recognized in earlier soil ingestion studies20 for vanadium, and resulted in that trace element not being included as a tracer for the study of Anaconda children. However, a similar problem appears to occur for Ti (which has been included in all studies for historical reasons), and may also occur for La, Ce, and Nd (trace elements not used in previous mass-balance soil ingestion studies). The impact of non-food/soil trace element sources is inflation of soil ingestion estimates, and poor reliability.
C. Factors that Contribute to Bias and Uncertainty in Soil Ingestion
The third section presents results on the sensitivity of soil ingestion estimates to a variety of assumptions inherent in the methodology. First, we present details on the development of a soil ingestion simulation model. Input for the simulation consists of parameters for the concentration of trace elements in food and soil; parameters for the food intake distribution; parameters for the soil ingestion distribution; and parameters for the transit time distribution. Using such parameters, data similar to those obtained by previous soil ingestion studies are generated. Although the simulated data are artificial, the true soil ingestion is known (via specification in the simulation), so the ability of current estimation strategies to characterize soil ingestion can be evaluated. The results of these evaluations are presented, with a manuscript summarizing the results included in Appendix B. Next, we present results of daily soil ingestion estimates based on different transit times of 6 and 48 hours. Finally, we evaluate variability and uncertainty simultaneously for the cumulative soil ingestion distribution using small sample estimates of the variance of the median soil ingestion on a day. These results are summarized in a manuscript in Appendix C.
Biasing Factors for Simple Soil Ingestion Estimates
We present results that quantify uncertainty in soil ingestion estimates based on mass-balance study designs. The results identify the importance of three possible sources of uncertainty: the length of the study; ingestion of non-sampled soil; and bioavailability of food. For each source, the bias that results from estimating the cumulative soil ingestion distribution by subject-specific average estimates is quantified. We use these results to discuss the extent to which variability can be separated from uncertainty in the context of available children's soil ingestion studies. The results are achieved by specifying a distribution of soil ingestion (known at the start), and then simulating other factors that would interfere with a simple objective measure of soil ingestion in a mass-balance study design.
Construction of the Simulation
The simulation takes advantage of what has been learned in previous mass-balance soil ingestion studies. The distributions of the amount of food ingested, the trace element concentration in food, the trace element concentration in soil, and the timing of fecal samples have been recorded in five studies of children. These distributions characterize what has been found in the past, and may encompass the distribution of these factors likely to occur in similar future mass-balance studies. In contrast, there may be considerable question about the relative size of uncertainty and variability when characterizing the distribution of soil ingested between children and days. We simulate other factors using extensive data on the two most promising trace elements (Al and Si) collected from four mass-balance studies. For each subject, we develop distributions of trace element food intake and fecal output that mirror results seen in previous research. The distributions are used to simulate food intake and fecal output data for a hypothetical population of subjects using a study design, as well as trace element concentrations in soil. We then postulate a distribution of soil ingestion. A simulation is conducted that results in a snapshot of the hypothetical study data for the design. Since the true soil ingestion of the subjects underlying the data is known, we use the results to evaluate the accuracy and reliability of current estimates. By varying the assumed soil ingestion distribution, we characterize the extent to which current studies can separate variability from uncertainty when estimating soil ingestion. The simulation also allows these factors to be investigated at minimal cost.
There are four stages in this study. First, using previous mass-balance study data, we characterize the distribution of trace element concentrations in food and soil along with the distribution of the quantity of food ingested, and develop a simulation program that will produce a hypothetical set of trace element amounts in food and fecal samples for a specified study protocol. Second, we verify that the simulations produce results that are similar to those reported in previous mass-balance soil ingestion studies. We do so by comparing the distributions resulting from the simulations with distributions from actual mass-balance studies, and by evaluating whether or not, with an adequate study design, the simulation will reproduce a hypothetical soil ingestion distribution. Next, we specify six hypothetical soil ingestion distributions, using current estimates of children's soil ingestion to bound the soil ingestion distributions. The six distributions are given as:
1. All subjects ingest 100 mg of soil on all days.
2. All subjects ingest 50 mg of soil on all days.
3. All subjects have a mean soil ingestion of 100 mg/day, but daily soil ingestion is normally distributed, with a standard deviation of 50 mg/d.
4. All subjects have a mean soil ingestion of 50 mg/day, but daily soil ingestion is normally distributed, with a standard deviation of 75 mg/d.
5. Daily soil ingestion is normally distributed, with mean 18 mg/d, subject standard deviation of 56 mg/d, and daily variance of 108 mg/d (based on the average of estimates from mixed models fit to Al, Si, Y, and Zr from the Amherst and Anaconda studies [10, Table 7]). The distribution is based on an average of the mean and variance component estimates from the Amherst and Anaconda studies, as developed in Stanek and Calabrese (1999, submitted).
6. An empirical soil ingestion distribution based on Amherst and Anaconda subjects (excluding the pica subject), with normally distributed daily variation in estimates based on a standard deviation modelled from a simple linear regression on the mean (intercept=63.5, slope=0.834; source: nkmz99p14.sas). Subject values were based on the average of the median daily estimate (assuming a 28 hr lag) for Al, Si, Y, and Zr. The empirical distribution is based on data from the same studies using the common trace elements (except Ti), with no outlier criteria applied.
Using these distributions, we describe the ability of a particular study design (based on a 4 day and a 7 day design) and estimation strategy (based on an overall average per subject) to characterize the hypothetical soil ingestion distribution (separating uncertainty from variability). In this context, we also assess the sensitivity of the uncertainty estimates to various assumptions, including ingestion of soil from a yard other than the subject's yard, and different absorption of trace elements. More details on the construction of the simulation are given in a manuscript in Appendix B.
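The first four hypothetical distributions listed above are simple enough to sketch directly. The code below draws only the true daily soil ingestion; it does not include the food intake, fecal output, soil concentration, or transit time components of the simulation described in Appendix B, and the function and variable names are illustrative assumptions.

```python
import random

# (mean mg/d, daily standard deviation mg/d) for distributions 1 to 4 above.
HYPOTHETICAL_DISTRIBUTIONS = {
    1: (100.0, 0.0),
    2: (50.0, 0.0),
    3: (100.0, 50.0),
    4: (50.0, 75.0),
}


def simulate_true_ingestion(dist_id, n_subjects, n_days, seed=0):
    """Draw true daily soil ingestion (mg/d) for each subject and day.

    Every subject shares the same long-run mean; days vary normally.
    Negative draws are possible because the normal model is unbounded.
    """
    mean, sd = HYPOTHETICAL_DISTRIBUTIONS[dist_id]
    rng = random.Random(seed)
    return [
        [rng.gauss(mean, sd) for _ in range(n_days)]
        for _ in range(n_subjects)
    ]


def subject_averages(daily):
    """Average soil ingestion per subject over the study days."""
    return [sum(days) / len(days) for days in daily]


constant_study = simulate_true_ingestion(1, n_subjects=5, n_days=7)
variable_study = simulate_true_ingestion(4, n_subjects=2000, n_days=7)
variable_averages = subject_averages(variable_study)
```

Under distribution #4 every subject has the same long-run mean, so any spread in `variable_averages` reflects day-to-day variability within a short study rather than differences between subjects.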
Results of Simulation Analysis
The Impact of the Study Duration
The simulations illustrate that the duration of the study has a large impact on estimates of the cumulative soil ingestion distribution. For simplicity, we limit presentations to two soil ingestion distributions, noting that similar results occur for other distributions. For most comparisons, we focus on the results based on soil ingestion distribution #6 (median: 26 mg/d; 90%: 106 mg/d; 95%: 149 mg/d), since this distribution comes closest to the soil ingestion observed in previous studies, and overlaps the empirical soil ingestion distribution at upper percentiles of soil ingestion. For some comparisons, we include the simpler soil ingestion distribution #4 (mean soil ingestion of 50 mg/d with a daily standard deviation of 75 mg/d).
First, we consider the impact of study duration on the distribution of average daily soil ingestion by simulating soil ingestion distributions over different study duration times. Table 13 presents results that illustrate the effect of study duration on the estimated soil ingestion distribution based on the soil ingestion distributions #4 and 6. Each cumulative distribution is obtained from a soil ingestion study of the specified duration by tabulating the average soil ingestion per person for 5000 subjects. The results illustrate the impact of study duration on the resulting distribution of average soil ingestion.
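The effect of study duration can be illustrated in closed form for a distribution like #4. The sketch below assumes that daily ingestion is independent and normal and that ingestion is measured without error, so a d-day average has standard deviation equal to the daily standard deviation divided by sqrt(d). It is not a reproduction of the simulation behind Table 13, which also includes food, fecal, and transit time variability; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def average_percentile(mean, daily_sd, n_days, p):
    """p-th quantile of the n_days average when all subjects share the
    same long-run mean and days vary independently and normally.

    Requires daily_sd > 0.
    """
    return NormalDist(mean, daily_sd / sqrt(n_days)).inv_cdf(p)


# Mean 50 mg/d and daily standard deviation 75 mg/d, as in distribution #4.
p95_four_day = average_percentile(50.0, 75.0, 4, 0.95)
p95_seven_day = average_percentile(50.0, 75.0, 7, 0.95)
p95_long_run = average_percentile(50.0, 75.0, 365, 0.95)
```

Under these assumptions the 95th percentile of the averages is roughly 112 mg/d for a four day study and roughly 97 mg/d for a seven day study, and it shrinks toward the common long-run mean of 50 mg/d as the study lengthens.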
Table 13. Cumulative Distributions of Average Soil Ingestion Developed from Simulations using Various Soil Ingestion Distributions for 4 and 7 day studies.

Soil Ingestion  Trace
Distribution    Element  Subjects  Days     Mean    Std    P01    P05    P10    P25    P50    P75    P90    P95    P99
#4 (1)          AL           5000     4     61.1   53.3  -80.5  -12.5    8.7   34.0   58.7   86.7  116.5  141.1  216.5
#4 (1)          AL           5000     7     61.4   33.5  -21.6   14.9   26.2   42.0   60.0   80.2   99.2  113.4  149.7
#4 (1)          AL       true soil ing.     61.4    3.1   54.0   56.2   57.4   59.3   61.4   63.5   65.4   66.6   68.7
#4 (1)          SI           5000     4     61.5   38.4  -21.9    2.9   16.6   36.5   59.8   84.5  108.9  126.7  162.2
#4 (1)          SI           5000     7     61.4   26.3    2.8   21.1   29.6   43.4   60.2   78.0   95.1  106.2  127.4
#4 (1)          SI       true soil ing.     61.4    3.1   54.2   56.3   57.4   59.3   61.4   63.5   65.5   66.5   68.6
#6 (2)          AL           5000     4     55.3   70.0  -82.9  -21.8   -4.7   15.3   41.0   81.1  139.5  185.2  298.5
#6 (2)          AL           5000     7     55.8   56.8  -44.6   -4.1    6.4   21.3   41.7   76.5  126.0  165.8  257.8
#6 (2)          AL       true soil ing.     53.8   41.4   21.6   23.2   24.1   26.1   36.0   67.0  105.9  149.0  194.0
#6 (2)          SI           5000     4     55.1   60.9  -34.5   -8.4    0.5   16.2   40.8   76.2  130.3  172.6  276.4
#6 (2)          SI           5000     7     55.9   53.3  -13.1    2.7    9.5   22.3   41.0   73.2  120.2  163.4  254.4
#6 (2)          SI       true soil ing.     53.8   41.4   21.6   23.2   24.1   26.1   36.0   66.9  105.7  149.1  194.2

(1) All subjects have a mean soil ingestion of 50 mg/day, but daily soil ingestion is normally distributed, with a standard deviation of 75 mg/d. (source: nk99p56.sas)
(2) An empirical soil ingestion distribution based on Amherst and Anaconda subjects (excluding the pica subject), with normally distributed daily variation in estimates based on standard deviation modelled from a simple linear regression on the mean (intercept=63.5, slope=0.834). (source: nk99p52.sas)
For soil ingestion distribution #4, although the true long run soil ingestion for each subject is 50 mg/d, the upper percentiles of the average soil ingestion distribution are much higher, particularly for studies of length four or seven days. Using the four or seven day averages to estimate the 95th percentile of the long run soil ingestion distribution will overestimate the true 95th percentile of long run soil ingestion. Similar, but less dramatic, results are evident using soil ingestion distribution #6. The results illustrate that short study designs, with the soil ingestion distribution estimated by the average soil ingestion, will overestimate the upper tails of the distribution (and underestimate the lower tails of the distribution). We summarize the expected impact on estimates of percentiles of these distributions in Table 13 for Al and Si. Focusing on the 95th percentile estimate in a four day study where the true long run soil ingestion was 50 mg/d (daily standard deviation 75 mg/d), the 95th percentile estimate will be positively biased by 112% for Al, and 91% for Si. For the same true long run soil ingestion distribution in a seven day study, the 95th percentile estimate will be positively biased by 70% for Al, and 60% for Si. Smaller biases occur when the results are tabulated based on the true long run soil ingestion distribution #6. In a four day study, the 95th percentile estimate will be positively biased by 24% for Al, and 16% for Si. For the same true long run soil ingestion distribution in a seven day study, the 95th percentile estimate will be positively biased by 11% for Al, and 10% for Si. The bias due to an increased spread of the estimated soil ingestion distribution can be summarized by comparing estimates of the between subject variance with the true between subject variance when the estimated variance is based on average soil ingestion estimates from study designs of different duration.
Using soil ingestion distribution #6, the variance between subjects will be overestimated by 186% and 116% in a four day study based on Al and Si, respectively, and by 88% and 66% in a seven day study based on Al and Si, respectively.
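The bias percentages quoted above follow directly from the entries of Table 13. The sketch below restates the arithmetic for soil ingestion distribution #6; the function names are illustrative, and the numbers are the P95 and Std entries from the table.

```python
def percent_bias(estimate, truth):
    """Percent by which an estimated percentile exceeds the true one."""
    return 100.0 * (estimate - truth) / truth


def percent_variance_inflation(estimated_sd, true_sd):
    """Percent by which the estimated between-subject variance exceeds
    the true variance, computed from standard deviations."""
    return 100.0 * ((estimated_sd / true_sd) ** 2 - 1.0)


# Distribution #6, Al: (P95, Std) for 4 day, 7 day, and true soil ingestion.
al_four, al_seven, al_true = (185.2, 70.0), (165.8, 56.8), (149.0, 41.4)
# Distribution #6, Si.
si_four, si_seven, si_true = (172.6, 60.9), (163.4, 53.3), (149.1, 41.4)

al_p95_bias_four = percent_bias(al_four[0], al_true[0])
al_p95_bias_seven = percent_bias(al_seven[0], al_true[0])
al_var_inflation_four = percent_variance_inflation(al_four[1], al_true[1])
si_var_inflation_seven = percent_variance_inflation(si_seven[1], si_true[1])
```

The same two functions applied to the distribution #4 rows give the larger biases reported for that distribution.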
The Impact of Ingestion of Soil from Neighbor's Yards and/or Absorption of Trace Elements from Food
Mass-balance soil ingestion studies have based soil ingestion estimates on trace element concentrations in soil collected from play areas identified by children's guardians. It is possible that soil ingested by a child may not have come from these identified areas, but rather from neighbors' or friends' yards. We evaluate the potential impact of using the wrong trace element soil concentration by simulating soil ingestion for a child based on a specified soil concentration, but selecting an independent trace element concentration in soil (to represent the mis-specified play area) for estimation of soil ingestion. We illustrate the results of such mis-specification for a 7-day study design, where the true soil ingestion is given by distribution #6. The impact is assessed by tabulating the distribution of the average 7-day soil ingestion estimates for Al and Si from a simulated sample of 5000 subjects (Table 14), and comparing the estimated distribution with the similar distribution where the trace element concentration in soil is correct.
Table 14. Cumulative Distributions of Average Soil Ingestion Developed from Simulations using Soil Ingestion Distribution #6 (1), taking into account absorption and ingesting different soil.

Trace
Element  Absorption  Soil       Subjects  Days     Mean    Std    P01    P05    P10    P25    P50    P75    P90    P95    P99
AL       none        same           5000     7     55.8   56.8  -44.6   -4.1    6.4   21.3   41.7   76.5  126.0  165.8  257.8
AL       none        different      5000     7     54.7   58.8  -37.8   -3.5    5.5   19.9   41.0   73.0  123.9  162.3  272.4
AL       none        constant       5000     7     55.9   57.7  -43.4   -4.0    6.2   20.7   41.4   76.6  127.7  168.0  261.9
AL       30% food    same           5000     7     36.2   59.4  -92.1  -36.0  -17.5    5.0   26.2   57.9  106.5  146.3  233.7
AL       30% food    different      5000     7     36.1   60.6  -91.0  -36.5  -15.5    4.5   25.1   55.9  106.0  143.8  248.0
AL       none        same       true soil          53.8   41.4   21.6   23.2   24.1   26.1   36.0   67.0  105.9  149.0  194.0
SI       none        same           5000     7     55.9   53.3  -13.1    2.7    9.5   22.3   41.0   73.2  120.2  163.4  254.4
SI       none        different      5000     7     55.1   54.2  -11.5    2.2    9.4   21.2   39.8   71.7  119.7  162.0  249.8
SI       none        constant       5000     7     55.7   53.7  -11.8    2.5    9.2   22.0   40.8   73.8  121.5  159.4  256.0
SI       30% food    same           5000     7     38.1   53.6  -40.4  -16.8   -8.6    5.0   24.3   55.9  103.2  144.7  236.8
SI       30% food    different      5000     7     37.1   54.1  -37.7  -17.7   -9.1    4.3   22.7   53.9  101.9  143.9  228.2
SI       none        same       true soil          53.8   41.4   21.6   23.2   24.1   26.1   36.0   66.9  105.7  149.1  194.2

(1) An empirical soil ingestion distribution based on Amherst and Anaconda subjects (excluding the pica subject), with normally distributed daily variation in estimates based on standard deviation modelled from a simple linear regression on the mean (intercept=63.5, slope=0.834). (source: nk99p52.sas)
Sources:
Row 1: From second row in Table 6 (for Al) of nkmz3.doc.
Row 2: From al81.txt created by nkmz99p81.sas.
Row 3: From al83.txt created by nkmz99p83.sas.
Row 4: From al80.txt created by nkmz99p80.sas.
Row 5: From al82.txt created by nkmz99p82.sas.
Row 6: From seventh row in Table 6 (for Al) of nkmz3.doc.
Row 7: From second row in Table 6 (for Si) of nkmz3.doc.
Row 8: From si81.txt created by nkmz99p81.sas.
Row 9: From si83.txt created by nkmz99p83.sas.
Row 10: From second row of Table 6 (for Si, 30% food absorption) of nkmz3.doc.
Row 11: From si82.txt created by nkmz99p82.sas.
Row 12: From seventh row in Table 6 (for Si) of nkmz3.doc.
The first two rows of Table 14 for Al and Si illustrate the impact of misspecifying the trace element concentration in soil. For either trace element, misspecification of soil concentrations has minimal impact on the estimated soil ingestion distribution. Upper percentiles of the distribution are similar, as are the means and standard deviations.
Since the estimated soil ingestion distribution appeared to be insensitive to misspecification of the child's play area, we examined the impact on the soil ingestion distribution when a single common trace element soil concentration was used for all children. These results are given in the third row of Table 14 for Al and Si, and illustrate that the estimated soil ingestion distribution is insensitive to use of a common soil concentration. Future mass-balance soil ingestion studies may take advantage of this result by collecting fewer soil samples to determine trace element soil concentration.
The results in Table 14 also describe the impact of absorption of trace elements from food. Although trace elements selected for use in mass-balance studies are thought to have low absorption, only limited studies have been conducted that characterize the actual absorption. If trace elements ingested in food are absorbed and the absorption is not accounted for when estimating soil ingestion, the soil ingestion distribution will be underestimated. We evaluate the impact of food absorption by assuming 30% of each trace element in food is absorbed, and examining the impact on the distribution of seven day average soil ingestion. Such results are given in rows 4 and 5 of Table 14 for each element. In the simulations, the average soil equivalent amount of Al in food is 64 mg/d, while the average soil equivalent amount of Si in food is 58 mg/d. The assumption of 30% absorption from food will reduce the soil equivalent amounts from food in fecal samples by 19 mg/d and 17 mg/d for Al and Si, respectively.
These values correspond closely to the observed bias of the soil ingestion distribution in row 4 of the panels for Al and Si in Table 14. Row 5 in Table 14 presents similar results when there is absorption from food and the play area is misspecified for the soil sample. Once again, since the impact of misspecifying the play area is negligible, the results are similar to those in row 4.
Finally, note that the last row in each panel of Table 14 presents the actual distribution of long term soil ingestion (calculated by simulating soil ingestion for 365 days on each of 5000 subjects). These results are identical to the last rows in each of the last two panels of Table 13. It is this distribution that we attempt to estimate in a soil ingestion study.
In summary, Table 14 illustrates that the most important influence on estimates of the distribution of soil ingestion is the duration of the study design. Misspecifying the child's play area has little impact on the distribution. Food absorption of 30% will bias the distribution estimates down by less than 20 mg/d.
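The correspondence between the assumed 30% food absorption and the downward shift in Table 14 can be checked with simple arithmetic. The sketch below uses only figures quoted in the text and in Table 14 (the food soil equivalents of 64 and 58 mg/d, and the mean 7-day estimates with and without absorption for the same soil); the variable names are illustrative, and comparing the means is one assumed way of summarizing the "observed bias".

```python
# All values in mg/d, taken from the text and from Table 14.
food_soil_equivalent = {"Al": 64.0, "Si": 58.0}   # average soil equivalent of food
absorbed_fraction = 0.30                          # assumed absorption from food

mean_no_absorption = {"Al": 55.8, "Si": 55.9}     # Table 14, same soil, no absorption
mean_with_absorption = {"Al": 36.2, "Si": 38.1}   # Table 14, same soil, 30% food absorption

expected_shift = {}
observed_shift = {}
for element in ("Al", "Si"):
    # Trace element absorbed from food never reaches the fecal sample,
    # so the estimate falls by the absorbed share of the food soil equivalent.
    expected_shift[element] = absorbed_fraction * food_soil_equivalent[element]
    observed_shift[element] = (
        mean_no_absorption[element] - mean_with_absorption[element]
    )
    print(f"{element}: expected {expected_shift[element]:.1f} mg/d, "
          f"observed {observed_shift[element]:.1f} mg/d")
# Al: expected 19.2 mg/d, observed 19.6 mg/d
# Si: expected 17.4 mg/d, observed 17.8 mg/d
```

The expected and observed shifts agree to within about half a milligram per day for both elements, consistent with the statement that 30% food absorption biases the estimates down by less than 20 mg/d.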
Sensitivity of Soil Ingestion Estimates to Transit Time Assumptions
We examine the sensitivity of soil ingestion estimates based on a 28 hour transit time assumption by comparing them with estimates constructed assuming a six or 48 hour transit time. For the six and 48 hour transit time assumptions, we calculate daily soil ingestion estimates for children in the Anaconda study, and compare the distributions of the resulting estimates. Note that for the three trace elements (Ce, La, and Nd) that were added in the Anaconda study, the food/soil ratios were anticipated to be low (and hence insensitive to transit time assumptions). This assumption proved to be true. The median food/soil ratios for each of these elements (on a daily basis) were less than 25 mg/d, implying that errors in transit time assumptions would have only a small impact on variability in daily soil ingestion estimates.
Daily soil ingestion estimates using a 6 hour lag time are calculated by subtracting 1/4 of the soil equivalent for food for the previous day, and 3/4 of the soil equivalent for food for the current day, from the soil equivalent for fecal output on the current day. Daily soil ingestion estimates using a 48 hour lag time are calculated by subtracting the soil equivalent for food for the day two days before the current day from the soil equivalent for fecal output on the current day. Soil ingestion estimates were created for the 64 children in the Anaconda study with the varying lag times. Daily soil ingestion estimates were created for each of eight elements (Al, Ce, La, Nd, Si, Ti, Y, Zr) for each subject-day, by forming the difference between fecal output and total food intake and dividing the difference by the concentration of the trace element in soil. We used the trace element soil concentration estimates in