Ultrasound Obstet Gynecol 2007; 30: 173–179 Published online 8 June 2007 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/uog.4037

Sonographic estimation of fetal weight: comparison of bias, precision and consistency using 12 different formulae

N. G. ANDERSON*, I. J. JOLLEY† and J. E. WELLS‡

*Radiology Department, Christchurch Hospital, ‡Department of Public Health and General Practice, Christchurch School of Medicine and Health Sciences, University of Otago, Christchurch, New Zealand and †Department of Radiology, Royal Hallamshire Hospital, Sheffield, UK

KEYWORDS: bias (epidemiology); birth weight; error sources; fetal weight; prenatal; ultrasonography

ABSTRACT

Objectives To determine the major sources of error in ultrasonographic assessment of fetal weight and whether they have changed over the last decade.

Methods We performed a prospective observational study in 1991 and again in 2000 of a mixed-risk pregnancy population, estimating fetal weight within 7 days of delivery. In 1991, the Rose and McCallum formula was used for 72 deliveries. Inter- and intraobserver agreement was assessed within this group. Bland–Altman measures of agreement from log data were calculated as ratios. We repeated the study in 2000 in 208 consecutive deliveries, comparing predicted and actual weights for 12 published equations using Bland–Altman and percentage error methods. We compared bias (mean percentage error), precision (SD percentage error), and their consistency across the weight ranges.

Results 95% limits of agreement ranged from −4.4% to +3.3% for inter- and intraobserver estimates, but were −18.0% to 24.0% for estimated and actual birth weight. There was no improvement in accuracy between 1991 and 2000. In 2000 only six of the 12 published formulae had overall bias within 7% and precision within 15%. There was greater bias and poorer precision in nearly all equations if the birth weight was < 1000 g.

Conclusions Observer error is a relatively minor component of the error in estimating fetal weight; error due to the equation is a larger source of error. Improvements in ultrasound technology have not improved the accuracy of estimating fetal weight. Comparison of methods of estimating fetal weight requires statistical methods that can separate out bias, precision and consistency. Estimating fetal weight in the very low birth weight infant is subject to much greater error than it is in larger babies. Copyright © 2007 ISUOG. Published by John Wiley & Sons, Ltd.

INTRODUCTION

Ultrasound has been used to estimate fetal weight for over 30 years. Accurate estimation of fetal weight is most important when dealing with high-risk populations, particularly when dealing with the two extremes of birth weight. Estimated fetal weight is taken into consideration when making clinical decisions involving induction or delay of labor and method of delivery1,2. Clinicians are becoming increasingly reliant on imaging, largely as a consequence of advances that have been made in ultrasound technology. There is a danger that the advances in some fields of obstetric ultrasound (depiction of fetal anomaly) will be extrapolated to imply an improved accuracy in others.

From its inception, ultrasound estimation of fetal weight has been presumed to be more accurate than clinical methods. The fundamental underlying presumption is that the sonographic measurements of multiple linear and planar dimensions of the fetus provide sufficient information to allow for accurate algorithmic reconstruction of the three-dimensional fetal volume of varying tissue density2. Sonographic assessment in many circumstances is no more accurate than clinical palpation in assessing fetal weight3–5. Both clinical palpation and ultrasound assessment of fetal weight are least accurate at the extremes of birth weight6. The inherent inaccuracy of two-dimensional fetal weight assessment is due to many variables, many of which are beyond our control, such as difficulties in obtaining accurate measurements of fetal parts, due to maternal obesity, anterior placement of the placenta, and oligohydramnios.

Correspondence to: Dr N. G. Anderson, Radiology Department, Christchurch Hospital, Private Bag 4710, Christchurch, New Zealand (e-mail: [email protected]) Accepted: 24 January 2007


In this study, we have paid careful attention to statistical methods of analysis. We have followed the advice of De Vet7, who counsels that in order to assess the clinical relevance of error, ‘tracing the sources and types (bias or random error) of the disagreements is the beginning of wisdom. For that purpose, presenting one single coefficient is insufficient and visual presentation of the data is advisable’. The plethora of equations to estimate fetal weight attests to the general acknowledgment that there are inaccuracies in all methods. We have sought to compare different equations using methods that allow us to pinpoint the causes for inaccuracies. These include bias (the systematic over- or under-prediction of weight) and precision (random error). As bias and precision can vary across the range of birth weights from very small to very large, it is important to assess the consistency of bias and precision throughout the weight range.

Previous publications comparing different formulae for estimating fetal weight can be divided into two groups: those that assess both bias and precision8–12; and another group in which a single measure, such as R² from regression or the interclass correlation coefficient, is used6,13.

The aim of this study was to determine the major sources of error in sonographic assessment of fetal weight and any changes over the last decade. In this study we have assessed: (1) the reliability and accuracy of sonographic estimates of fetal weight; (2) the change in accuracy, if any, of estimated fetal weight between 1991 and 2000; (3) a comparison of agreement between actual and estimated fetal weight for 12 published sonographic methods14–23 (Table 1).

METHODS

The 1991 study

We undertook a prospective observational study. Over a 3-month period we ascertained the birth weight of all infants born at our institution. In 1991, we used Acuson 128 XP (Siemens Acuson, Mountain View, CA, USA) and Toshiba SSA-77B ultrasound machines (Toshiba Corp, Tustin, CA, USA). We included in the study all women for whom a pregnancy scan, including estimation of fetal weight, had been performed within 7 days of birth. The nature of the inclusion criteria meant that our study sample, drawn from a mixed-risk population, included many high-risk pregnancies. Seventy-two consecutive deliveries in 1991 met these criteria.

The estimated fetal weight was calculated using the Rose and McCallum formula, which the authors call FETHAL14, and entered into a database. This formula involves summation of femur length (FL), abdominal diameter (AD) and biparietal diameter (BPD) to give a number in millimeters. The estimated fetal weight corresponding to this number is then read from a look-up chart derived from the regression equation ln(BW) = 0.143(BPD + AD + FL) + 4.198 (reference 14). At least two hard-copy images of each of these parameters were obtained. Two sonographers independently reviewed the hard-copy images, and re-recorded the most accurate measurements of the femur length, abdominal diameter and biparietal diameter to estimate fetal weight.

In the 1991 dataset, 64 of the 72 babies had fetal weight estimated using the Rose and McCallum formula three times. The first assessment was by a mixed group of sonographers (EFW1). Later, two of those sonographers independently reassessed the images and re-estimated fetal weight (EFW2, EFW3). Gestational age at delivery was 24–42 weeks. Mean actual birth weight was 2895 (SD 903 (range 680–4400)) g; mean estimated fetal weight was 2807 (SD 912 (range 700–5200)) g. Four babies were under 1000 g. Interobserver agreement was assessed for all 64 babies (EFW2 vs. EFW3) by comparing the later estimates made by a sonographer with those they had made in the first set of estimates. Because differences tended to increase with average estimates, a log transformation was applied to each EFW, as recommended by Bland and Altman24. Therefore, the measures of agreement obtained were ratios, rather than absolute values. These have been converted to percentage departures from agreement, for ease of interpretation.

Table 1 Twelve regression equations for estimating fetal weight

Reference | Regression equation
Rose and McCallum14 | Ln(BW) = 0.143(BPD + AD + FL) + 4.198
Warsof et al.15 | Log10(BW) = (0.144 × BPD) + (0.032 × AC) − (0.000111 × AC × BPD²) − 1.599
Shepard et al.16 | Log10(BW) = (0.166 × BPD) + (0.046 × AC) − (0.002646 × AC × BPD) − 1.7492
Hadlock et al. A17 | Log10(BW) = 1.5662 − (0.0108 × HC) + (0.0468 × AC) + (0.171 × FL) + (0.00034 × HC²) − (0.003685 × AC × FL)
Hadlock et al. B18 | Log10(BW) = 1.304 + (0.05281 × AC) + (0.1938 × FL) − (0.004 × AC × FL)
Hadlock et al. C18 | Log10(BW) = 1.3596 + (0.00061 × BPD × AC) + (0.424 × AC) + (0.174 × FL) + (0.0064 × HC) − (0.00386 × AC × FL)
Ferrero et al.19 | Log10(BW) = 0.77125 + (0.13244 × AC) − (0.12996 × FL) − (0.00173588 × AC²) + (0.00309212 × FL × AC) + (2.18984 × FL/AC)
Woo et al.20 | Log10(BW) = 1.13705 + (0.15549 × BPD) + (0.0464 × AC) − (0.00279682 × BPD × AC) + (0.037769 × FL) − (0.000494529 × AC × FL)
Thurnau et al.21 | BW = (BPD × AC × 9.337) − 229
Weiner et al. A22 | Log10(BW) = 1.6961 + (0.02253 × HC) + (0.01645 × AC) + (0.06439 × FL)
Weiner et al. B22 | Log10(BW) = 1.6575 + (0.04035 × HC) + (0.01285 × AC)
Higginbottom et al.23 | BW = AC³ × 0.0816

AC, abdominal circumference; AD, abdominal diameter (equivalent to AC/π); BPD, biparietal diameter; BW, estimated birth weight; FL, femur length; HC, head circumference.
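As an illustration of the Rose and McCallum regression quoted in the text, the following minimal Python sketch evaluates ln(BW) = 0.143(BPD + AD + FL) + 4.198 directly instead of reading a look-up chart. It assumes the three measurements are supplied in centimetres, because only that scale gives plausible weights with these coefficients; the text describes the chart sum in millimetres, so the unit handling here is an assumption. The function names and the example measurements are hypothetical and do not come from the study.

```python
import math


def ad_from_ac(ac_cm: float) -> float:
    """Abdominal diameter from abdominal circumference (AD = AC / pi, per Table 1)."""
    return ac_cm / math.pi


def rose_mccallum_efw(bpd_cm: float, ad_cm: float, fl_cm: float) -> float:
    """Estimated fetal weight in grams from ln(BW) = 0.143(BPD + AD + FL) + 4.198.

    Assumption: inputs are in centimetres (see the lead-in above).
    """
    return math.exp(0.143 * (bpd_cm + ad_cm + fl_cm) + 4.198)


# Hypothetical near-term measurements, for illustration only.
efw_g = rose_mccallum_efw(bpd_cm=9.0, ad_cm=10.5, fl_cm=7.0)
print(f"Estimated fetal weight: {efw_g:.0f} g")
```

Because the model is linear in the logarithm of weight, a fixed error in the summed measurements translates into a roughly constant percentage error in weight, which is consistent with the use of ratio (log-scale) measures of agreement in this study.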

The 2000 study

We repeated the comparison of estimated fetal weight and actual weight in 2000 using Acuson 128 XP and Acuson Aspen (Siemens Acuson, Mountain View, CA, USA) ultrasound instruments. The study group in 2000 comprised 208 deliveries over a 12-month period that met our original inclusion criteria. However, the abdominal circumference (AC) and head circumference (HC) were also recorded. Fetal weight was again calculated using the Rose and McCallum formula14 (Table 2). Gestational age at delivery was 23–42 weeks. Mean actual birth weight was 2334 (SD 869 (range 400–4450)) g; mean estimated fetal weight was 2292 (SD 886 (range 400–4458)) g. Thirteen babies were under 1000 g. The ethnicity of our population was similar for the two cohorts: 72% Caucasian, 11% Maori and other Polynesian, and 17% other.

Comparison of the accuracy of Rose and McCallum fetal weight estimates in 1991 and 2000

Scans performed by a number of sonographers from all available deliveries in 1991 and 2000 were used, with fetal weight estimated by the Rose and McCallum formula14. The percentage error, 100 × (actual weight − estimated weight)/estimated weight, was calculated for comparison with the results from the Bland–Altman log analyses.

The two years were compared on both bias and precision to determine whether there had been any improvement over the decade (see Table 2).

Comparison of different equations for estimating fetal weight in 2000

Only 192 of the 208 deliveries had all scan measurements recorded for all 12 equations (Table 1). We determined how well the predicted and actual weights agreed for 12 published equations14–23 (Table 1) using Bland–Altman24 and percentage error methods (Tables 2 and 3). Interpretation of the different regression equations for estimating fetal weight requires assessment of four criteria: bias (mean percentage error), precision (SD percentage error), consistency of bias across the weight ranges, and consistency of precision across the weight ranges. Bias and precision are the major factors that allow ranking and comparison. Consistency across the weight ranges is a secondary characteristic.
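The two primary criteria can be made concrete with a short Python sketch: bias as the mean of the percentage errors and precision as their standard deviation, with estimated weight as the denominator as defined in this paper. The use of the sample SD (n − 1 denominator), the function names and the example weights are assumptions for illustration; the study itself was analyzed in SAS.

```python
from statistics import mean, stdev


def percentage_errors(actual_g, estimated_g):
    """Percentage error per fetus: 100 x (ABW - EFW) / EFW."""
    return [100.0 * (abw - efw) / efw for abw, efw in zip(actual_g, estimated_g)]


def bias_and_precision(actual_g, estimated_g):
    """Return (bias, precision) = (mean, sample SD) of the percentage errors."""
    errors = percentage_errors(actual_g, estimated_g)
    return mean(errors), stdev(errors)


# Hypothetical actual and estimated weights in grams, for illustration only.
actual = [950.0, 2100.0, 3300.0, 4000.0]
estimated = [1000.0, 2000.0, 3000.0, 4000.0]

bias, precision = bias_and_precision(actual, estimated)
print(f"bias = {bias:.2f}%, precision (SD) = {precision:.2f}%")
```

A positive bias under this definition means that actual weights are, on average, higher than the estimates, that is, the formula underestimates weight.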

Statistical methods

To assess the reliability of the Rose and McCallum formula14, we used the Bland–Altman method24 for assessing agreement. This indicates systematic disagreement (mean difference) and random disagreement (limits of agreement). Also, Bland–Altman plots indicate whether differences increase with means; such a pattern indicates the need for log transformations.

For investigation of the accuracy of estimates of fetal weight, Bland–Altman plots were again used. In addition, the percentage error was calculated. Both methods show bias (systematic over- or under-estimation) and precision (SD of the difference between estimated and actual weight). Ideally, the estimated weight should be precise and unbiased. We wanted to pinpoint the reason for any differences, not just to determine whether there was a discrepancy between estimated and actual weight. Therefore, we rejected the use of single measures such as mean square error or the intraclass correlation coefficient. Regression analysis was rejected because of its dependence on the range of birth weights, which made it particularly unsuitable for assessing equations across weight ranges.

To compare equations, we used both Bland–Altman analyses and the percentage error method to investigate discrepancies between estimated and actual birth weight. Bland–Altman plots showed that it was necessary to transform the weights using logarithms, because the variability in the difference between estimated and actual birth weight increased with birth weight for all equations except that of Thurnau et al.21. Therefore, the Bland–Altman analyses worked with the ratio of actual to estimated weights. The percentage error method also uses ratios but is more intuitive for clinicians. We have used the same definition of percentage error as that used by Edwards et al.8: % error = 100 × (ABW − EFW)/EFW, where ABW = actual birth weight and EFW = estimated fetal weight. We agree with Edwards et al.8 that it is the potential error in estimated weight (EFW) that is relevant to a clinician making a management decision based on the ultrasound result, which is why we have used EFW, not ABW, as the denominator. The fraction of fetal weight estimates within 15% of actual birth weight is a common comparative measure1,2,15–18,20,23. We have used this 15% standard error (SE) as a cut-off point for comparing the performance of the 12 published formulae in Figures 1 and 2.

All analyses were carried out in SAS version 9.1. Comparison of bias across years used a t-test for independent groups. Comparison of bias and precision across weight ranges in 2000 used one-way analysis of variance, with Levene’s test to compare variances. Comparison of variances for intraobserver percentage error and percentage error relative to actual birth weight involved related samples (one sonographer estimating one sample of fetuses). The method used was to calculate the correlation between the sum and the difference between the two types of percentage error (intraobserver and actual vs. estimated)25. This study was considered to be an audit by the local Ethics Committee.

RESULTS

Reliability of Rose and McCallum fetal weight estimates

Table 2 shows that both inter- and intraobserver estimates showed little systematic disagreement. The variability in agreement produced 95% limits of agreement ranging from −4.4% to +3.3%.

Table 2 Ratio measures of agreement and limits of agreement from Bland–Altman plots of log(EFW), expressed as percentages*, in estimates for 64 deliveries†

Source | n | Bias (%) | Limits of agreement (%)‡
Interobserver EFW2/EFW3 | 64 | 0.0 | −3.4, 3.3
Intraobserver EFW1/EFW2 | 12 | −1.2 | −3.9, 1.6
Intraobserver EFW1/EFW3 | 22 | −1.0 | −4.4, 2.1

*Ratios obtained after back-transformation of differences in logs were converted to percent departure from a ratio of 1.0 (Bland–Altman24). †Mean EFW = 2807 (range, 700–5200) g. ‡Mean difference ± 2 SD of the difference. EFW, estimated fetal weight using Rose and McCallum formula14; EFW1, estimated fetal weight at time of scan by any of seven sonographers; EFW2, estimated fetal weight by one sonographer reviewing the original images; EFW3, estimated fetal weight by another sonographer reviewing the original images.

Accuracy of Rose and McCallum fetal weight estimates

On average, actual birth weights were higher than Rose and McCallum fetal weight estimates, although the bias was no more than 3% (Figure 1, Table 3). The 95% limits of agreement were approximately ± 20% (Table 3), considerably higher than those found when looking at reliability. Within the 1991 dataset, the variance between sonographers was significantly less than that between each sonographer and actual birth weight (P < 0.0001; Table 3). There were no significant differences in the Bland–Altman estimate of bias or the mean percentage error between the 1991 and 2000 datasets (P > 0.96), neither was there any difference in variability (P > 0.86). There was no improvement between 1991 and 2000 (Table 3).

Table 3 Measures of agreement and limits of agreement between actual and estimated birth weights, expressed as percentages

Source | n | Bland–Altman results (logs)*: bias | Bland–Altman results (logs)*: limits of agreement§ | Percentage error†: mean | Percentage error†: ± 2 SD
1991 ABW/EFW2 | 64 | 1.6 | −17.9, 25.7 | 2.2 | −19.4, 23.8
1991 ABW/EFW3 | 64 | 1.5 | −18.3, 26.0 | 2.1 | −20.0, 24.1
1991 ABW/EFW‡ | 64 | 2.4 | −16.7, 25.8 | 2.9 | −18.0, 23.8
1991 ABW/EFW‡ | 72 | 3.0 | −16.5, 27.1 | 3.1 | −16.6, 27.4
2000 ABW/EFW‡ | 208 | 3.0 | −18.0, 24.0 | 3.1 | −18.0, 24.0

*The ratios obtained after back-transformation of differences in logs were converted to percent departure from a ratio of 1.0 (Bland–Altman24). †Percentage error = 100 × (ABW − EFW)/EFW. ‡Estimates made by a number of different sonographers. §Mean difference ± 2 SD of the difference. ABW, actual birth weight; EFW, estimated fetal weight using Rose and McCallum formula14: ln(ABW) = 0.143(BPD + AD + FL) + 4.198, where AD is abdominal diameter, BPD is biparietal diameter, and FL is femur length.

Figure 1 Plot of bias vs. precision for 12 published formulae14–23, plotting the mean percentage error in estimated fetal weight (EFW) of the complete cohort against the SD of percentage error in weight estimate of 192 fetuses (mean birth weight 2344 g, EFW < 1000 g in 13, 1000–1999 g in 55, 2000–2999 g in 75, and > 3000 g in 49). The best performing formulae are within the dotted box. The bias or mean percentage error is < 7% and the SD of percentage error is < 15% for this group of six. The best performing formulae are those performing closest to the origin of the graph. (Axes: bias, mean % error in weight estimate; precision, SD of % error in weight estimate.)

Figure 2 Plot of bias vs. precision for 13 fetuses whose estimated weight was < 1000 g, using 12 formulae14–23. There is only one published formula (Rose14) within the 5% bias and 15% precision, as shown in Figure 1. The performance of virtually all the formulae is much worse for infants whose estimated weight is < 1000 g. By widening the tolerance of bias to 10% and precision to 25%, a further four published formulae are included. None of the Hadlock17,18 formulae performed well in these small babies. (Axes: bias, mean % error in weight estimate; precision, SD of % error in weight estimate.)
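The log-scale Bland–Altman summaries reported in Tables 2 and 3 can be sketched in a few lines of Python: differences of natural logs are averaged, limits are taken as the mean ± 2 SD of those differences, and each value is back-transformed to a percent departure from a ratio of 1.0. This is an illustrative sketch under stated assumptions (sample SD, hypothetical function name and data), not the SAS code used in the study.

```python
import math
from statistics import mean, stdev


def log_bland_altman(actual_g, estimated_g):
    """Return (bias %, lower limit %, upper limit %) from log-scale differences.

    Differences are ln(actual) - ln(estimated); limits are mean +/- 2 SD,
    back-transformed to percent departure from a ratio of 1.0.
    """
    diffs = [math.log(a) - math.log(e) for a, e in zip(actual_g, estimated_g)]
    centre, spread = mean(diffs), stdev(diffs)

    def to_percent(log_ratio):
        return 100.0 * (math.exp(log_ratio) - 1.0)

    return (
        to_percent(centre),
        to_percent(centre - 2.0 * spread),
        to_percent(centre + 2.0 * spread),
    )


# Hypothetical actual and estimated weights in grams, for illustration only.
actual = [980.0, 2150.0, 2900.0, 3600.0, 4100.0]
estimated = [1000.0, 2000.0, 3000.0, 3400.0, 4000.0]

bias, lower, upper = log_bland_altman(actual, estimated)
print(f"bias = {bias:.1f}%, limits of agreement = {lower:.1f}% to {upper:.1f}%")
```

Back-transformation makes the limits asymmetric around the bias, which matches the asymmetric intervals shown in the tables (for example −18.0% to 24.0%).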


Overall comparison of published models of estimation of fetal weight

We compared how well estimated and actual weights agreed for 12 published methods of estimating fetal weight14–23 (Tables 2, 4 and 5). Table 4 reports absolute errors of estimation of birth weight as well as the percentage error. However, Bland–Altman plots showed clearly that absolute errors rose with mean weight (the average of estimated and actual weight), so that the absolute error found depended on the weight distribution of the babies; a sample of small babies would yield smaller absolute errors than a sample of large babies. The percentage error (Table 4 and Figure 1) is the more appropriate measure. Figure 1 shows that six of the equations (Rose14, Warsof15, Shepard16, Hadlock A17, Hadlock C18, Woo20), in general, form a cluster in which the overall bias is within 7%, and the precision is within 15%. For the remaining six equations18,19,21–23, there is either much greater bias or much poorer precision of the estimates.
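The clustering described above can be reproduced from the mean and SD percentage errors reported in Table 4. The sketch below applies the 7% bias and 15% precision thresholds to those published values; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# (mean % error, SD % error) for each equation, as reported in Table 4.
TABLE_4 = {
    "Rose and McCallum": (4.04, 11.30),
    "Warsof": (4.77, 12.64),
    "Shepard": (-0.54, 11.91),
    "Hadlock A": (6.09, 14.81),
    "Hadlock B": (3.34, 17.78),
    "Hadlock C": (3.96, 14.67),
    "Ferrero": (-6.88, 17.07),
    "Woo": (3.80, 11.96),
    "Thurnau": (7.30, 17.13),
    "Weiner A": (18.38, 13.87),
    "Weiner B": (22.16, 16.72),
    "Higginbottom": (8.29, 25.78),
}


def within_box(bias_pct, sd_pct, max_bias=7.0, max_sd=15.0):
    """True if absolute bias and SD both fall inside the tolerance box."""
    return abs(bias_pct) <= max_bias and sd_pct <= max_sd


best = [name for name, (bias, sd) in TABLE_4.items() if within_box(bias, sd)]
print(best)
```

Ferrero and Hadlock B fail on precision alone and Weiner A fails on bias alone, which illustrates why a single combined measure would hide the reason a formula performs poorly.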

Variation across the weight range for each equation

The ideal formula for estimating fetal weight would show very little bias (< 1%) and a high level of precision (< 5%), with a high level of consistency across weight ranges. None of the formulae satisfies all these requirements. Warsof15 comes closest, with Rose14 and Shepard16 next. Woo20 performs moderately well for bias and precision in general, but is inconsistent. The remainder suffer from either marked bias or marked lack of precision. Nearly all equations perform worse if the estimated weight is < 1000 g (Figure 2, Table 5). There is poorer precision for nearly all formulae, and greater bias in some. The Hadlock formulae17,18 were particularly poor in the smaller infant, whereas performance was as good as any other formula if the estimated fetal weight was > 1000 g (Table 5). This was due to a combination of substantial systematic underestimation of weight and very poor precision in the group of babies estimated to be < 1000 g (SD percentage error 37–50%).
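Consistency across the weight range can be explored by computing bias and precision separately within each weight band, as in the sketch below. The band edges follow the groups in Table 5, but placing exactly 3000 g in the top band, grouping by actual birth weight, and the function names and data are assumptions. The sketch stops at descriptive summaries; the study went on to compare the bands with one-way analysis of variance and Levene's test, which are not reproduced here.

```python
from collections import defaultdict
from statistics import mean, stdev

# Weight bands in grams; treating 3000 g itself as part of the top band
# is an assumption made for this illustration.
BANDS = [
    (0.0, 1000.0, "under 1000 g"),
    (1000.0, 2000.0, "1000-1999 g"),
    (2000.0, 3000.0, "2000-2999 g"),
    (3000.0, float("inf"), "3000 g and over"),
]


def band_of(weight_g):
    """Return the label of the band containing weight_g."""
    for low, high, label in BANDS:
        if low <= weight_g < high:
            return label
    raise ValueError(f"weight out of range: {weight_g}")


def per_band_summary(actual_g, estimated_g):
    """Map band label -> (mean % error, SD % error, n), banding by actual weight."""
    groups = defaultdict(list)
    for abw, efw in zip(actual_g, estimated_g):
        groups[band_of(abw)].append(100.0 * (abw - efw) / efw)
    summary = {}
    for label, errors in groups.items():
        sd = stdev(errors) if len(errors) > 1 else float("nan")
        summary[label] = (mean(errors), sd, len(errors))
    return summary


# Hypothetical weights in grams, for illustration only.
actual = [800.0, 900.0, 1500.0, 1600.0, 2500.0, 2600.0, 3500.0, 3600.0]
estimated = [1000.0, 1000.0, 1500.0, 1500.0, 2500.0, 2400.0, 3500.0, 3300.0]

for label, (bias, sd, n) in per_band_summary(actual, estimated).items():
    print(f"{label}: n={n}, bias={bias:.1f}%, SD={sd:.1f}%")
```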

Table 4 Comparison of performance of 12 equations for estimating fetal weight in 192 deliveries: absolute error and percentage error* (actual vs. estimated birth weight)

Equation reference | Mean estimated birth weight (g)† | Mean difference (actual − estimated weight) (g) | SD of difference (g) | Mean % error (bias)‡ | SD % error (precision)§
Rose and McCallum14 | 2262 | 72 | 259 | 4.04 | 11.30
Warsof et al.15 | 2240 | 94 | 262 | 4.77 | 12.64
Shepard et al.16 | 2356 | −22 | 264 | −0.54 | 11.91
Hadlock et al. A17 | 2226 | 108 | 248 | 6.09 | 14.81
Hadlock et al. B18 | 2299 | 35 | 270 | 3.34 | 17.78
Hadlock et al. C18 | 2276 | 58 | 244 | 3.96 | 14.67
Ferrero et al.19 | 2538 | −204 | 292 | −6.88 | 17.07
Woo et al.20 | 2256 | 78 | 248 | 3.80 | 11.96
Thurnau et al.21 | 2116 | 218 | 376 | 7.30 | 17.13
Weiner et al. A22 | 1998 | 336 | 265 | 18.38 | 13.87
Weiner et al. B22 | 1920 | 414 | 328 | 22.16 | 16.72
Higginbottom et al.23 | 2243 | 91 | 335 | 8.29 | 25.78

*Percentage error = 100 × (ABW − EFW)/EFW, where ABW = actual birth weight and EFW = estimated fetal weight; n = 192, not 208, as some fetuses were missing a measurement required for some equations. †Actual mean birth weight = 2334 (range 400–4450) g. ‡Larger values (whether positive or negative) indicate greater bias. §Higher values indicate poorer precision.

Table 5 Comparison of percentage error in estimated fetal weight (EFW) across weight groups for 12 equations for 192 deliveries

Columns, for each weight group: mean % error (bias)*, SD % error (precision)†. Consistency columns: P for mean % error‡¶, P for variance of % error‡#.

Equation reference | ABW < 1000 g (n = 13): mean, SD | ABW 1000–1999 g (n = 55): mean, SD | ABW 2000–2999 g (n = 75): mean, SD | ABW > 3000 g (n = 49): mean, SD | P for mean % error | P for variance of % error
Rose and McCallum14 | −4.78, 9.43 | 5.22, 11.08 | 5.09, 10.38 | 3.46, 12.5 | 0.02 | 0.59
Warsof et al.15 | −1.45, 18.73 | 3.17, 11.82 | 5.4, 11.07 | 7.24, 13.48 | 0.11 | 0.27
Shepard et al.16 | −6.29, 18.10 | −2.51, 11.14 | −0.05, 10.22 | 2.45, 12.61 | 0.05 | 0.19
Hadlock et al. A17 | 10.56, 37.52 | 4.30, 12.31 | 6.12, 10.28 | 6.83, 13.35 | 0.55 | 0.002§
Hadlock et al. B18 | 14.66, 50.31 | 1.43, 13.00 | 2.61, 11.96 | 3.57, 13.19 | 0.11 | 0.001§
Hadlock et al. C18 | 12.12, 37.81 | 2.41, 11.94 | 3.44, 9.94 | 4.34, 12.85 | 0.19 | 0.002§
Ferrero et al.19 | −0.09, 44.08 | −9.99, 12.52 | −7.53, 15.16 | −4.21, 10.60 | 0.16 | 0.006§
Woo et al.20 | 1.63, 20.76 | 1.29, 11.18 | 3.72, 9.60 | 7.31, 12.56 | 0.07 | 0.02
Thurnau et al.21 | −22.81, 14.15 | −5.21, 9.00 | 9.69, 9.18 | 25.69, 10.84 | < 0.0001 | 0.20
Weiner et al. A22 | 13.18, 12.96 | 18.53, 13.49 | 20.19, 11.89 | 16.85, 16.92 | 0.30 | 0.27
Weiner et al. B22 | 7.03, 13.82 | 18.54, 14.63 | 25.11, 16.19 | 25.72, 17.80 | 0.0004 | 0.77
Higginbottom et al.23 | 25.53, 68.84 | 8.05, 17.12 | 8.47, 24.38 | 3.71, 15.69 | 0.06 | 0.01

*Larger values (whether positive or negative) indicate greater bias. †Higher values indicate poorer precision. ‡Higher values indicate better consistency (no significant differences across the weight groups); significant P-values indicate significant differences in bias or precision between different weight groups. ¶Consistency of mean % error across weight ranges. #Consistency of variance of error across weight ranges. §Very poor consistency due to large error in EFW < 1000 g. % error = 100 × (ABW − EFW)/EFW, where ABW = actual birth weight and EFW = estimated fetal weight.

DISCUSSION

Sonographic fetal weight estimates are reliable (approximately 3% systematic disagreement) but relatively inaccurate (95% limits of agreement are about 20%). There was no change in accuracy between 1991 and 2000. Twelve published formulae for estimating fetal weight show considerable variation in bias, precision and consistency.

There are three main findings from our study. First, the error in estimating fetal weight is predominantly due to the equation used, with observer error a relatively minor component. Second, there was no improvement in accuracy of estimates of fetal weight between 1991 and 2000. Third, estimating fetal weight in very low birth weight infants is subject to much greater error (poorer precision) than it is in larger babies. In addition, as we have justified earlier in this paper, comparison of methods of estimating fetal weight requires statistical methods that can separate out bias, precision and consistency. Comparison of methods of fetal weight estimation would yield different conclusions if only bias or only precision were investigated.

Using the Rose and McCallum formula14, we have shown that there is virtually no inter- or intraobserver bias in estimation of fetal weight, and there is agreement between and within observers of −3% to +3%. This is in marked contrast to the differences between actual and estimated birth weight, where there is systematic underestimation of birth weight of 2–3% and the actual weight varies from 27% below the estimate to 18% above it (± 2 SD). The six published formulae that performed best in our dataset systematically over- or underestimate fetal weight by up to 7%, with one SD of the error of up to 15%. The other six formulae showed more bias, more imprecision, or both.

The error is even greater when the actual birth weight is < 1000 g. The commonly used Hadlock formulae17,18 were particularly poor when used with smaller babies, systematically underestimating the actual weight by 10–14% (SD 37–50%). It would be foolhardy to make critical clinical decisions based on fetal weight alone with this degree of unreliability of estimating fetal weight. An alternative might be to use a different formula for the very small baby. As we had only 13 infants in this low birth weight group, we cannot make sweeping assumptions or give clear guidelines as to which of the formulae might be better for use with very low birth weight infants. Others have identified the special problems of estimating fetal weight in very small infants, but none of these studies showed such a marked error as in our study. Others have found SD percentage error of 10–15% in very small babies1,8,9, whereas we found SD 50% (Table 5). Other authors have found that estimated fetal weight models in general underestimate the weight of infants < 1000 g9,12,26. As can be seen from Table 5, we have found that while some models significantly underestimate weight in the small infant, others overestimate it, and yet others, such as Woo20, Warsof15 and Ferrero19, have minimal bias in these small babies. Other authors have addressed the difficulties in estimating the weight of very small fetuses26,27.

In this study, we have used appropriate statistical methods to allow us to pinpoint the causes of the error in estimating fetal weight. We have shown that observer error is a very minor component. This has allowed us to focus on the error components of the regression equations themselves.
The importance of bias, precision, and consistency in comparing published equations for estimating fetal weight can be seen when comparing Tables 4 and 5. Bias (the difference between actual and estimated) and precision (the SD of this difference) may show systematic under- or overestimation of weight, or be different in the various birth weight groups. Looking at Table 4 alone, Thurnau21 and Hadlock A17 look very similar, but Thurnau has marked inconsistency of bias across the weight range, whereas Hadlock A is consistent. Woo20 has similar overall bias and precision to Warsof15, but Woo shows lack of consistency across the weight range, in contrast to the consistency shown by Warsof.

Some of the formulae we have tested are better than others. When a formula or equation for estimating fetal weight is developed, the coefficients calculated for a given dataset will be the best in some sense. However, the equations may be inappropriate in that variables may be missing or their form mis-specified (e.g. linear where there should be a quadratic term as well). Furthermore, even if the model is appropriate, the coefficients estimated from the ‘training set’ may not be appropriate for other datasets.

The disadvantages of our study are that some of the data are now old, and few of the infants were very small or very large. The preponderance of high-risk pregnancies in our study can be regarded as an advantage or disadvantage, depending on viewpoint. We have used models of estimated fetal weight which use linear measurements only; we have not attempted to use three-dimensional ultrasound models28,29. We have not allowed for any increased growth within the 1–7 days prior to birth. Some authors have used a correction of 25 g for each day between the ultrasound measurements and delivery11.

The inaccuracies of estimating fetal weight have been recognized for a long time. Our study has used a methodology for analyzing the sources of error in estimating fetal weight that can serve as a blueprint for analyzing future developments in estimating fetal weight.

ACKNOWLEDGMENTS
We thank all the sonographers and pregnant women involved in this study. We thank Judith Dawson for typing and layout of the manuscript.

REFERENCES
1. Kaaij MW, Struijk PC, Lotgering FK. Accuracy of sonographic estimates of fetal weight in very small infants. Ultrasound Obstet Gynecol 1999; 13: 99–102.
2. Nahum GG, Stanislaw H. Ultrasonographic prediction of term birth weight: how accurate is it? Am J Obstet Gynecol 2003; 188: 566–574.
3. Watson WJ, Soisson AP, Harlass FE. Estimated weight of the term fetus. Accuracy of ultrasound vs. clinical examination. J Reprod Med 1988; 33: 369–371.
4. Baum JD, Gussman D, Wirth JC III. Clinical and patient estimation of fetal weight vs. ultrasound estimation. J Reprod Med 2002; 47: 194–198.
5. Rogers MS, Chung TK, Chang AM. Ultrasound fetal weight estimation: precision or guess work? Aust N Z J Obstet Gynaecol 1993; 33: 142–144.
6. Kurmanavicius J, Burkhardt T, Wisser J, Huch R. Ultrasonographic fetal weight estimation: accuracy of formulas and accuracy of examiners by birth weight from 500 to 5000 g. J Perinat Med 2004; 32: 155–161.
7. De Vet H. Observer reliability and agreement. In Encyclopedia of Biostatistics, Armitage P, Colton T (eds). Wiley: Chichester, 1998; 3123–3127.
8. Edwards A, Goff J, Baker L. Accuracy and modifying factors of the sonographic estimation of fetal weight in a high-risk population. Aust N Z J Obstet Gynaecol 2001; 41: 187–190.
9. Pinette MG, Pan Y, Pinette SG, Blackstone J, Garrett J, Cartin A. Estimation of fetal weight: mean value from multiple formulas. J Ultrasound Med 1999; 18: 813–817.
10. Mirghani HM, Weerasinghe S, Ezimokhai M, Smith JR. Ultrasonic estimation of fetal weight at term: an evaluation of eight formulae. J Obstet Gynaecol Res 2005; 31: 409–413.
11. Chien PF, Owen P, Khan KS. Validity of ultrasound estimation of fetal weight. Obstet Gynecol 2000; 95: 856–860.
12. Jouannic JM, Grange G, Goffinet F, Benachi A, Cabrol D. Validity of sonographic formulas for estimating fetal weight below 1250 g: a series of 119 cases. Fetal Diagn Ther 2001; 16: 254–258.
13. Donma MM, Donma O, Sonmez S. Prediction of birth weight by ultrasound in Turkish population. Which formula should be used in Turkey to estimate fetal weight? Ultrasound Med Biol 2005; 31: 1577–1581.
14. Rose BI, McCallum WD. A simplified method for estimating fetal weight using ultrasound measurements. Obstet Gynecol 1987; 69: 671–675.
15. Warsof SL, Gohari P, Berkowitz RL, Hobbins JC. The estimation of fetal weight by computer-assisted analysis. Am J Obstet Gynecol 1977; 128: 881–892.
16. Shepard MJ, Richards VA, Berkowitz RL, Warsof SL, Hobbins JC. An evaluation of two equations for predicting fetal weight by ultrasound. Am J Obstet Gynecol 1982; 142: 47–54.
17. Hadlock FP, Harrist RB, Carpenter RJ, Deter RL, Park SK. Sonographic estimation of fetal weight. The value of femur length in addition to head and abdomen measurements. Radiology 1984; 150: 535–540.
18. Hadlock FP, Harrist RB, Sharman RS, Deter RL, Park SK. Estimation of fetal weight with the use of head, body, and femur measurements – a prospective study. Am J Obstet Gynecol 1985; 151: 333–337.
19. Ferrero A, Maggi E, Giancotti A, Torcia F, Pachi A. Regression formula for estimation of fetal weight with use of abdominal circumference and femur length: a prospective study. J Ultrasound Med 1994; 13: 823–833.
20. Woo JS, Wan CW, Cho KM. Computer-assisted evaluation of ultrasonic fetal weight prediction using multiple regression equations with and without the fetal femur length. J Ultrasound Med 1985; 4: 65–67.
21. Thurnau GR, Tamura RK, Sabbagha R, Depp OR III, Dyer A, Larkin R, Lee T, Laughlin C. A simple estimated fetal weight equation based on real-time ultrasound measurements of fetuses less than thirty-four weeks’ gestation. Am J Obstet Gynecol 1983; 145: 557–561.
22. Weiner CP, Sabbagha RE, Vaisrub N, Socol ML. Ultrasonic fetal weight prediction: role of head circumference and femur length. Obstet Gynecol 1985; 65: 812–817.
23. Higginbottom J, Slater J, Porter G, Whitfield CR. Estimation of fetal weight from ultrasonic measurement of trunk circumference. Br J Obstet Gynaecol 1975; 82: 698–701.
24. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986; 1: 307–310.
25. Armitage P, Berry G, Matthews JNS. Statistical Methods in Medical Research, 4th edn. Blackwell Science: Oxford, 2002.
26. Medchill MT, Peterson CM, Kreinick C, Garbaciak J. Prediction of estimated fetal weight in extremely low birth weight neonates (500–1000 g). Obstet Gynecol 1991; 78: 286–290.
27. Scott F, Beeby P, Abbott J, Edelman D, Boogert A. New formula for estimating fetal weight below 1000 g: comparison with existing formulas. J Ultrasound Med 1996; 15: 669–672.
28. Schild RL, Fimmers R, Hansmann M. Fetal weight estimation by three-dimensional ultrasound. Ultrasound Obstet Gynecol 2000; 16: 445–452.
29. Lee W, Deter RL, Ebersole JD, Huang R, Blanckaert K, Romero R. Birth weight prediction by three-dimensional ultrasonography: fractional limb volume. J Ultrasound Med 2001; 20: 1283–1292.

Copyright © 2007 ISUOG. Published by John Wiley & Sons, Ltd.

Ultrasound Obstet Gynecol 2007; 30: 173–179.