Statistics in Medicine
Research Article Received XXXX
(www.interscience.wiley.com) DOI: 10.1002/sim.0000
F tests with random samples size. Theory and applications Elsa E. Moreiraa ∗ , Jo˜ao T. Mexiaa and Christoph E. Minderb When the dimensions n1 , ..., nk of the samples for the k levels of the one-way layout cannot be considered as known, it is more correct to consider these, as realizations of independent poisson random variables with parameters λ1 , ..., λk . This situation arises mostly when there is a given time span for collecting the observations. A good example is the collecting data during a given time span for the comparison of pathologies. The number of patients that arrive to an hospital with different pathologies during that period, is not known in advance. The data is obtained from the patients with each pathology as soon as they present themselves. Once the data is in hand, the F test statistics with conditional F distribution given the number of observations, can be obtained. This approach can also be used while planning study, in a design phase. Given a predefined test power 1 − β for α level test, the minimum sample size and correspondent minimum duration of the collection period can be found. The application c 2010 John of F tests with random sample sizes is further demonstrated through a practical example. Copyright Wiley & Sons, Ltd. Keywords: ANOVA; F distribution; Poisson distribution; Power analysis; Random sample sizes
1. Introduction The theoretical developments presented in this paper were motivated by a real case situation in the field of medicine. Consider the numbers of patients that arrive to an hospital with different pathologies during a given time span. These numbers cannot be known in advance because, if we decide to do the counting in a different time period of the same length, the number of patients obtained with those pathologies will be different from the first counting. So, if, due to limitations in the budget, we have to conduct just one study to compare the pathologies, it is more correct to consider the sample dimensions as realizations of random variables. The data will be collected from the patients with each pathology as soon as they present themselves. This situation arises mostly because there is a given time span for collecting the observations. After the elapsed time interval, the F test statistics can be obtained to test hypotheses about the different means values. In what follows, it is assumed that the sample dimensions n1 , ..., nk for the k pathologies are realizations of independent Poisson variables with parameters λ1 , ..., λk and the observations in these sample are normal, independent with variance σ 2 . As a result, the F statistic to test the null hypotheses H0 : µ1 = ... = µk will have conditional F distribution on the number of observations as will be seen in the next section. The approach presented is also useful while planning studies, before the data are in. The minimum data collection duration to ensure, with a given probability, reliable results will be obtained. Thus, when all k pathologies are covered, the conditional distribution of the F test statistic has k − 1 and n − k degrees of freedom, with n the sum of all ni , i = 1, ..., k and non-centrality parameter δ , which is null when the mean values µ1 , ..., µk for different pathologies are equal. Thus, δ will measure the “distance” of the alternatives from H0 . So, we may look for the minimum duration that ensures, with a given probability, that
a
CMA - Research center in Mathematics and Applications, Faculty of Sciences and Technology, Nova University of Lisbon Horten Zentum, Medical Faculty, University of Zurich ∗ Correspondence to: CMA - Research center in Mathematics and Applications, Faculty of Sciences and Technology, Nova University of Lisbon, 2829-516 Caparica, Portugal. E-mail:
[email protected] b
Statist. Med. 2010, 00 1–?? Prepared using simauth.cls [Version: 2010/03/10 v3.00]
c 2010 John Wiley & Sons, Ltd. Copyright
Statistics in Medicine
E. E. MOREIRA, J. T. MEXIA, C. E. MINDER
• all pathologies are covered, i.e.,there is at least one observation for each pathology; • the conditional power of the α level test for a given δ is sufficient high.
In sub-section 3.1, we see how to determine n˙ to have a sufficient powerful F test, while in sub-section 3.2, the minimum time span for, with a probability 1 − β , having ni > 0, i = 1, ..., k and n ≥ n˙ will be obtained. Finally, an application with simulated data is presented in section 5 to illustrate the methodology.
2. Conditional F distributions Let the components of the vector n = (n1 , ..., nk ) of the sample sizes be realizations of the components of the vector N = (N1 , ..., Nk ). The components of N are independent Poisson variables, with vector of parameters λ = (λ1 , ..., λk ). Assuming that, when n > 0, the samples xi,1 , ..., xi,ni , i = 1, ..., k are normal with mean values µ1 , ..., µk and variance σ 2 , for testing H0 : µ1 = ... = µk the F test statistic Pk T 2 T 2 n − k i=1 nii − n F= Pk k−1 i=1 Si Pni 2 Pni P Pk k is used, where Ti = j=1 xi,j , Si = j=1 xi,j − Ti2 /ni , T = i=1 Ti and n = i=1 ni . Given our assumptions, this statistics will have, given N = n, F distribution with k − 1 and n − k degrees of freedom and non-centrality parameter k 1 X δ= 2 ni (µi − µ.)2 σ i=1
where
1X n i µi n k
µ. =
i=1
is the general mean value. In a previous paper, [1], the unconditional distribution for the F statistics was derived. This distribution is given by the following series X q(n)F (z|k − 1, n − k, δ(n)) (1) F˙ (z) = n>0
whose terms corresponds to the vectors n = (n1 , ..., nk ), with q(n) =
k Y e−λi λni
ni
i!
i=1
1 − e−λi
.
However, when the null hypothesis H0 : µ1 = ... = µk holds, δ(n) = 0, ∀n > 0 and X q(n)F (z|k − 1, n − k). F ∼ F˙ 0 (z) = n>0
The truncation error of the series (1) when restricting ourselves to samples with n ≤ no was studied in a previous paper [1]. These truncation errors are perfectly controlled even if the sample dimensions are relatively small, i.e., the truncation errors will be bounded by kε b(no ) < (1 − e−λo )k if noi is chosen such that no i X λni e−λi i > 1 − ε, i = 1, ..., k, (2) ni ! n =0 i
with λo = M in{λ1 , ..., λk } [1]. For instance, if the minimum of the λi , i = 1, ..., k is λ0 = 1 and for ε = 10−4 , the individual sample dimensions should not be less than 6 to have the truncation error controlled [1]. Theses formulas enable us to solve the equation F˙0 (z) = 1 − α for z , to obtain the (1 − α)-th quantiles, the critical values of α level tests. After obtaining the quantiles for the distribution F˙0 (z), the inference for the one-way ANOVA to compare the k pathologies proceeds in the usual way [2, 3]. Moreover, this approach extends directly to fixed effects orthogonal ANOVA with more than one factor [4, 5]. 2 www.sim.org Prepared using simauth.cls
c 2010 John Wiley & Sons, Ltd. Copyright
Statist. Med. 2010, 00 1–??
Statistics in Medicine
E. E. MOREIRA, J. T. MEXIA, C. E. MINDER
3. Sample size analysis Sample size analysis using the statistical power of a hypothesis test, is an important tool for deciding what sample size is required to guarantee a reasonable chance to reject a false null hypothesis. In the next two sub-sections, the steps to be followed are indicated. 3.1. Obtaining the global sample dimension We are working with F tests for fixed effect models with r and s degrees of freedom for which, a null hypotheses H0 and an alternative one Ha were previously formulated. The power function of this tests is given by P owα (δ) = P r(F > f1−α,r,s |H0 false)) = 1 − F (f1−α,r,s |r, s, δ) = 1 − β
where f1−α,r,s is the (1 − α)-th quantile of a central F distribution with r and s degrees of freedom. Now, we could want to know what is the minimum value s = s(r, δ, α, β) to ensure that F (f1−α,r,s |r, s, δ) = β,
(3)
that is, for having a power 1 − β for the α-level F test with non-centrality parameter δ and r degrees of freedom on the numerator. In order to compute the non-central F distribution values the formulas presented in the Appendix are useful. With k the number of treatments and n the sum of the samples dimensions for all treatments, we have r = k − 1 and s = n − k . So, from (3), using a numerical method to solve, in order to s, the equation δ
e− 2
∞ X ( δ )l 2
l=0
l!
F(
r f1−α,r,s |r + 2l, s) = β r + 2l
the n˙ = s + k may be obtained. The minimum global sample size n˙ guarantees that, for all alternatives with non-centrality parameter not less than δ , the test power is at least 1 − β . These will be the selected alternatives. Notice that, in (3), the distribution F and not the F˙ was used, because our interest was only in the total sample size n and not in the individual sample sizes. Also, note that the parameter δ measures the ”distance” between the alternatives and the tested hypothesis. Thus, we looked for the minimum global sample size that, for a given δ , ensured a sufficient power for the test. This led us to work with the F that, under H0 , depends directly only on the global sample size and not on the ni , i = 1, ..., k , thus lightning the calculations. 3.2. Obtaining the minimum duration by sampling In the last section, the minimum global sample dimension n˙ to get an F test with power larger than or equal to 1 − β was obtained for the selected alternatives. Now, we may want to know how much time is necessary to wait in order to get this minimum number of observations and for all pathologies to be covered. Assuming the existence of k independent Poisson counting processes Ni (t); t ≥ 0, i = 1, ..., k , corresponding to the arrivals of the patients with the k pathologies, and reliable lower bounds λ˙ 1 , ..., λ˙ k for the processes intensities, the probability of getting at least one patient per pathology is pr(N(t) > 0) =
k Y
pr(Ni (t) > 0) =
i=1
k Y
[1 − pr(Ni (t) = 0)] >
i=1
k Y
˙
[1 − e−λi t ], i = 1, ..., k.
(4)
i=1
Focusing on the sum process N (t) =
k X
Ni (t), t ≥ 0,
i=1
the corresponding intensity will have lower bound λ˙ =
k X
λ˙ i ,
I=1
so, pr(N (t) ≥ n) ˙ = 1 − pr(N (t) < n) ˙ ≥1−
n−1 ˙ X j=1
Statist. Med. 2010, 00 1–?? Prepared using simauth.cls
˙
e−λt
c 2010 John Wiley & Sons, Ltd. Copyright
˙ j (λt) . j! www.sim.org
3
Statistics in Medicine
E. E. MOREIRA, J. T. MEXIA, C. E. MINDER Table 1. Minimum individual sample sizes pathology
1
2
3
λi ni min.
8.56 26
0.64 7
0.34 6
Then, on account of Boole/Bonferroni’s inequality, the probability of getting a global sample size lager than or equal to n˙ and individual sample sizes different from zero is given by pr[(N(t) > 0) ∩ (N (t) ≥ n)] ˙ ≥ pr(N(t) > 0) + pr(N (t) ≥ n) ˙ −1>
k Y
˙
(1 − e−λi t ) −
i=1
n−1 ˙ X j=1
˙
e−λt
˙ j (λt) . j!
Now, setting a probability p to have N(t) > 0 and N (t) ≥ n˙ , the time t can be chosen such that k Y i=1
˙
(1 − e−λi t ) −
n−1 X j=1
˙
e−λt
˙ j (λt) > p. j!
(5)
Notice that when the λi are very small, negative values can be obtained for expression (5). In these cases a very high number of time units t are required to obtain, with an high probability p, both N(t) > 0 and N (t) ≥ n˙ .
4. Example of application In order to illustrate the usefulness of the method, in this section is presented an example of application with simulated data. Consider the people arriving at an hospital with pathologies 1, 2 and 3 during a given time span, from which a blood sample is taken to register a specific value. Suppose that is known that the arrivals rates for the 3 pathologies are λ1 = 8.56, λ2 = 0.64 and λ3 = 0.34 cases per day, in that hospital. Before moving to data collection, we may want to answer to the following questions: 1. What is the minimum individual sample sizes required to have a controlled (small) truncation error for the series in the unconditional distribution (1) of the F statistics? 2. What is the minimum global sample size n˙ needed to obtain a pre-fixed test power of 80%? 3. How long is required to wait to obtain this minimum global sample size? In the next section the answers to these questions will be given. After that, we can proceed with the analysis of the collected data. 4.1. Planning the study Taking ε = 10−6 , the inequality (2) is used to computed the individual minimum sample sizes in order to answer to question 1 and present them in Table 1. If these minimum sample sizes are satisfied, we have the guarantee that the truncation error will have an upper bound of 0.0001230. Our aim is to perform an F test for a one-way ANOVA to compare k = 3 different pathologies with α = 0.05% of significance. The F statistics will then have 2 and n − 3 degrees of freedom. Letting n increasing from 4 to 100, expression (3) was computed in a loop where δ , starting at zero, also increases until F (f0.95,2,n−3 |2, n − 3, δ) ≤ 0.2, that is, the loop stops when the power ≥ 80%. In Table 2 is presented for each value of n ≥ 4, the smallest value of δ for which the power of the F test is greater or equal to 80%. Thus, from the Table 1, we know that with a total sample size of 39 = 26 + 7 + 6 or more the truncation error of the unconditional F distribution is small enough to use the quantiles of this distribution as critical values in the F test of hypothesis. Beside this, from Table 2, we can see that with a total sample size of 39 or more, the corresponding power of the test will be 80% or more for a non-centrality parameter δ ≥ 11. Therefore, the answer to question 2 is that the minimum global sample size for our future study should be 39. Suppose now that we want to have high probability of getting n ≥ 39 and ni > 0, ∀i = 1, 2, 3. Computing the probability given by expression (5) for t increasing from 1 to 100 and stopping when pr(n ≥ 39 ∩ n1 > 0 ∩ n2 > 0 ∩ n3 > 0) > 1 − ε for some ε, it was found that for t˙ ≃ 30 the value for that probability was 0.99997. In other words, the data collection period should be at least 30 days in order to build a test with a good power of 80%. 4 www.sim.org Prepared using simauth.cls
c 2010 John Wiley & Sons, Ltd. Copyright
Statist. Med. 2010, 00 1–??
Statistics in Medicine
E. E. MOREIRA, J. T. MEXIA, C. E. MINDER Table 2. Pairs of n and δ that allow to have a test power ≥ 80% n δ
4 81
5 63
6 32
7 23
8 19
9 17
10 16
11 15
12 14
13 14
14 13
15 13
16 13
17 13
18 12
... 12
26 12
27 11
... 11
84 11
85 10
... 10
100 10
Table 3. Observations ni
pathology 1
pathology 2
pathology 3
1.12 0.06 1.09 0.27 1.58 0.40 0.03 0.36 1.03 1.86 1.68 0.45 1.59 0.81 0.71 0.44 1.33 0.70 0.29 1.26 0.06 1.30 1.23 0.74 1.78 1.71 0.9179
1.96 1.83 0,88 1.62 1.98 1.78 1.66 1.80 1.99 0.35
0.65 0.46 0.67 0.49 0.85 0.64
1.5833
0.6266
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 mean
Table 4. One way ANOVA table ANOVA
df
SS
F statistic
δ
Treatments Error Total
2 39 41
4.3652 11.3263 15.6915
7.5154
11.1999
4.2. Data analysis A study was conducted in that hospital with a collection period of 30 days. The 42 observations collected are presented in Table 3. In Table 4 is presented the P one way ANOVA used to test the null hypothesis H0 : µ1 = µ2 = µ3 . The distribution F˙0 (z) = n>0 q(n)F (z|2, 39) of the F statistic was computed and the corresponding series was truncated, that is, the terms such n1 > 26 ∧ n2 > 10 ∧ n3 > 6 were not considered. This condition guaranties that the truncation error is less than 0.0001230. Using a numerical method the equation F˙0 (z) = 0.95 was solved in order to z and obtained the quantile f˙(0.95;2;39) , which is the critical value of the 0.05 level test. Since the obtained critical value is 4.8586 < 7.5154, the null hypothesis is rejected and we can concluded that the 3 pathologies are significantly different. Moreover, since δ = 11.199 and n = 42, according to the Table 2, a test power ≥ 80% is guaranteed. Had we considered the sample sizes n1 , n2 , n3 as fixed numbers rather than as Poisson random variables, the critical value for this test with the usual central F distribution would be 3.2381 instead of 4.8586. As a result, when using F tests with random sample size an higher value for the F statistic is required in order to protect us from the randomness of the Statist. Med. 2010, 00 1–?? Prepared using simauth.cls
c 2010 John Wiley & Sons, Ltd. Copyright
www.sim.org
5
Statistics in Medicine
E. E. MOREIRA, J. T. MEXIA, C. E. MINDER
sample sizes.
5. Conclusions In a comparison study, when there is a specific time span to collect the observations and the sizes of the samples cannot be decided previously, it is appropriate to consider the sample dimensions as realizations of independent Poisson variables. In this case, the use of a larger critical value obtained from the distribution F˙ it is mandatory, to protect against the randomness of the sample size. Furthermore, the approach presented in the planning phase of a study can be used to obtain the minimum duration for data collection that ensures the chosen power and level of significance.
Acknowledgement This work was partially supported by CMA/FCT/UNL, under the project PEst-OE/MAT/UI0297/2011.
Appendix Let F¯ (z|r, s, δ) =
χ2r,δ χ2s
be the distribution of the variable T = rs F . Easily, we can see that
and that
s F¯ (z|r, s, δ) = F ( z|r, s, δ) r
(6)
r F (z|r, s, δ) = F¯ ( z|r, s, δ) s
(7)
[1]. Also, see [6] F¯ (z|r, s, δ) = e−δ/2
∞ X ( δ )l 2
l=0
l!
F¯ (z|r + 2l, s)
then, using (7) and (6), we get δ
F (z|r, s, δ) = e− 2
∞ X ( δ )l
∞ X ( δ )l
l=0
l=0
δ r F¯ ( z|r + 2l, s) = e− 2 l! s
2
2
l!
F(
r z|r + 2l, s) r + 2l
[1], which can be use to compute the non-central F distributions if there is no software available.
References 1. Moreira E.E. and Mexia, J.T. Randomized sample size F tests for the one-way layout. AIP Conference Proceedings, volume 1281- ICNAAM 2010- 8th International Conference of Numerical Analysis and Applied Mathematics, 2010, 1248–1251. 2. Scheff´e H. The Analysis of Variance. John Willey & Sons. New York, 1959. 3. Montgomery D.C. Design and Analysis of Experiments - 5th edition. John Willey & Sons. New York, 1997. 4. Nunes C., Ferreira D., Ferreira S., Mexia J.T. F-tests with a rare pathology, Journal of Applied Statistics, 2011. DOI:10.1080/02664763.2011.603293. 5. Mexia J.T., Nunes C., Ferreira D., Ferreira S., Moreira E. Ortogonal fixed efects ANOVA with random sample sizes. WSEAS proceedings of the 5th International conference on Applied Mathematics, Simulation, Modelling, 2011, 84–90. 6. Kendall M. and Stuart A. The advanced theory of statistics. Vol.II. Charles Griffin & Co. Londres, 1961.
6 www.sim.org Prepared using simauth.cls
c 2010 John Wiley & Sons, Ltd. Copyright
Statist. Med. 2010, 00 1–??