Article in press - uncorrected proof Clin Chem Lab Med 2010;48(11):1593–1601 2010 by Walter de Gruyter • Berlin • New York. DOI 10.1515/CCLM.2010.315
Research Article
Common reference intervals for aspartate aminotransferase (AST), alanine aminotransferase (ALT) and γ-glutamyl transferase (GGT) in serum: results from an IFCC multicenter study
Ferruccio Ceriotti1,*, Joseph Henny2, Josep Queraltó3, Shen Ziyu4, Yeşim Özarda5, Baorong Chen6, James C. Boyd7 and Mauro Panteghini8 on behalf of the IFCC Committee on Reference Intervals and Decision Limits (C-RIDL) and Committee on Reference Systems for Enzymes (C-RSE)
1 Diagnostica e Ricerca S. Raffaele, Istituto Scientifico Universitario S. Raffaele, Milano, Italy
2 Laboratoire de Biologie Clinique, Centre de Médicine Préventive, Vandoeuvre-lès-Nancy, France
3 Servei de Bioquímica, Hospital de la Santa Creu i Sant Pau, Barcelona, Spain
4 National Center for Clinical Laboratory, Beijing Hospital Compound, Beijing, P.R. China
5 Uludag University Medical School, Department of Biochemistry and Clinical Biochemistry, Bursa, Turkey
6 Beijing Aerospace General Hospital, Beijing, P.R. China
7 University of Virginia Health System, Department of Pathology, Charlottesville, VA, USA
8 Centro Interdipartimentale per la Riferibilità Metrologica in Medicina di Laboratorio (CIRME), Università degli Studi, Milano, Italy
Abstract
Background: Aspartate aminotransferase (AST), alanine aminotransferase (ALT) and γ-glutamyl transferase (GGT) measurements are important for the assessment of liver damage. The aim of this study was to define the reference intervals (RIs) for these enzymes in adults, paying attention to standardization of the methods used and careful selection of the reference population.
Methods: AST, ALT and GGT were measured with commercial analytical systems standardized to the IFCC-recommended reference measurement systems. Three centers (two in Italy and one in China) measured their own freshly collected samples; one of these centers also measured frozen samples from the Nordic Countries RI Project and from a Turkish center. RIs were generated using non-parametric techniques from the results of 765 individuals (411 females and 354 males, 18–85 years old) selected on the basis of the results of other laboratory tests and a specific questionnaire.
Results: AST results from the four regions (Milan, Beijing, Bursa and Nordic Countries) were statistically different, but these differences were too small to be clinically relevant. Likewise, the difference between the upper reference limits for the two genders was only 1.7 U/L (0.03 μkat/L), allowing a single RI of 11–34 U/L (0.18–0.57 μkat/L) to be defined. Interregional differences were not statistically significant for ALT, but partitioning was required due to significant gender differences. RIs for ALT were 8–41 U/L (0.13–0.68 μkat/L) for females and 9–59 U/L (0.15–0.99 μkat/L) for males. The upper reference limits for GGT from the Nordic Countries population were higher than those from the other three regions, and results from this group were excluded from the final calculations. The GGT RIs were 6–40 U/L (0.11–0.66 μkat/L) for females and 12–68 U/L (0.20–1.13 μkat/L) for males.
Conclusions: For AST and ALT, the implementation of common RIs appears to be possible, because no differences between regions were observed. However, a common RI for GGT that is applicable worldwide appears unlikely due to differences among populations.
Clin Chem Lab Med 2010;48:1593–601.
Keywords: alanine aminotransferase; aspartate aminotransferase; γ-glutamyl transferase; reference intervals.
*Corresponding author: Ferruccio Ceriotti, Diagnostica e Ricerca S. Raffaele, Via Olgettina 60, 20132 Milano, Italy. Phone: +39 0226432282, Fax: +39 0226432640, E-mail: [email protected]
Previously published online October 30, 2010
Introduction
The interpretation of medical laboratory data is a comparative decision-making process, and reference values play an essential role in this process. Although the definition of disease-specific decision limits will likely supersede the use of reference values in the future, at present the definition of reference values is still paramount (1). The introduction by the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) of newly optimized methods (at 37 °C) for the measurement of the catalytic activity of several enzymes (2–4) created the need to redefine appropriate reference intervals (RIs). The definition of RIs is a complex process and has been well-described in the recently reviewed Clinical and Laboratory Standards Institute
(CLSI)–IFCC document C28-A3 (5). It is difficult for a single clinical laboratory to establish RIs properly, particularly for tests whose results are gender and/or age dependent. However, the increasing availability of commercial assays traceable to the IFCC reference procedures is eliminating (or at least greatly reducing) the dependency of test results on the method used for measurement and the need for separate RI studies, unless regional population differences are present (6). Currently, there is only limited evidence that ethnic or behavioral differences produce clinically relevant or even statistically significant differences in reference values. Provided that there are no regional population differences, multicenter studies to develop common RIs offer an ideal solution (7, 8). Such studies can be conducted in two different ways: 1) each participating laboratory performs its own measurements [as described in paragraph 5.2 of the CLSI–IFCC document (5)], or 2) each center collects the samples from reference individuals, stores them at an appropriate temperature and ships them to a central laboratory that performs all the measurements. The first approach has the disadvantage of requiring thorough verification of the standardization of measurements performed by each participating laboratory, but uses fresh samples, exactly as in clinical practice. The second approach does not require alignment of measurements performed in different laboratories, but uses frozen samples, thereby introducing an additional step in the pre-analytical phase that may affect the results. Multicenter studies, if properly conducted, allow the identification of population differences, the presence of which, at least for some analytes, has been suggested by Ichihara et al. (9, 10), thus requiring caution in the adoption of common RIs (11).
The aim of this work was to define RIs for aspartate aminotransferase (AST) [L-aspartate:2-oxoglutarate aminotransferase, EC 2.6.1.1], alanine aminotransferase (ALT) [L-alanine:2-oxoglutarate aminotransferase, EC 2.6.1.2] and γ-glutamyl transferase (GGT) [(γ-glutamyl)-peptide:amino acid γ-glutamyltransferase, EC 2.3.2.2] that could be applied by all clinical laboratories using an assay system providing results traceable to the corresponding IFCC primary reference measurement procedure. A related goal was to determine whether ethnic or other population-based differences for these three enzymes were present that would require development of separate RIs.
Materials and methods
Control materials
Frozen native human samples from a single donation, collected according to the protocol in CLSI C37-A (12) and enriched with human recombinant enzymes, were prepared by MCA laboratories, The Netherlands, as described previously (13). Vials containing these materials at three different concentrations were used for trueness verification. They were sent on dry ice to three laboratories included in the Joint Committee for Traceability in Laboratory Medicine (JCTLM) list of reference measurement services for enzymes [Centro Interdipartimentale per la Riferibilità Metrologica (CIRME), Deutsche Gesellschaft für Klinische Chemie und Laboratoriumsmedizin (DGKL) and Laboraf] for value targeting. At each center the materials were analyzed in duplicate in three different analytical series (six measurements per center) using the IFCC reference measurement procedures (2–4) with home-brew reagents. To check the internal variability of assays employed in the participating laboratories, the Bio-Rad Liquid Assayed Multiqual material (QC) (Bio-Rad Laboratories Srl, Segrate, Italy) was used.
Structure of the experiments
Three clinical center laboratories were originally involved in the collection and measurement of reference samples (two in Italy and one in China). Additional samples were obtained frozen (–80 °C) from the NOBIDA Biobank [samples from the Nordic Countries Reference Intervals Project, NORIP (14)] and from a Turkish laboratory (located in Bursa). These samples were analyzed in one of the Italian laboratories that were collecting samples. Analyses were performed on three commercial analytical systems [Siemens Advia 2400, Roche Hitachi Modular and Hitachi 7170A (equivalent to Hitachi 917)] employing the manufacturer's reagents, calibrators, and control materials for each analyzer and strictly following the manufacturer's instructions. Compliance with the in vitro diagnostic (IVD) directive was indicated through the CE ("Communautés Européennes") marking of conformity on assays (15). Reagents for analysis of the transaminases included pyridoxal-5-phosphate.
Standardization of participating laboratories
Each participating laboratory received five frozen 0.5 mL aliquots of "trueness control material" (TCM) (three levels) and 25 vials per level (two levels) of frozen QC material for evaluation of intralaboratory variability. The control material was shipped on dry ice. To check analytical trueness, the participating laboratories first performed triplicate measurements of the three TCMs in five different analytical runs (15 results for each TCM). If results were not aligned with the expected values assigned by the reference laboratories, the obtained results were used to recalibrate the measuring systems and assure optimal traceability to the corresponding IFCC reference procedures. To calculate the correction factors, a regression equation was calculated comparing the target values assigned by the reference laboratories and the overall means obtained by the routine methods (from 21 to 38 replicates, according to the center performing the analysis). It was decided to neglect the intercepts, forcing the regression line through zero, to avoid overcorrection related to extrapolation to zero when analyzing samples with enzyme activities in the range from approximately 40 U/L to approximately 100 U/L. QC materials were first measured together with the TCM and then used to check the quality of the analytical measurements during the measurement of samples from reference individuals (duplicate measurements at the beginning and end of each analytical run of about 10 reference samples).
Selection of the reference population
The eligible reference individuals signed an informed consent form and underwent an anamnestic interview with a physician using an ad hoc questionnaire. Clinical exclusion criteria were the following: presence of diabetes mellitus, myopathies, burns and muscle trauma, hypothyroidism, chronic nephropathy, acute and chronic infection, hepatobiliary disease, use of therapeutic drugs with an effect on serum enzyme activities (i.e., warfarin, antiepileptics, diphenylhydantoin, aminopyrin, antidepressants, analgesics, antibiotics), pregnancy, body mass index (BMI) >30 kg/m², alcohol consumption >30 g/day, or heavy exercise in the previous days.
Table 1 Laboratory tests and corresponding thresholds used as exclusion criteria.
Laboratory test                        Threshold
Plasma fasting glucose                 >7.0 mmol/L (1.26 g/L)
Serum creatinine                       >18 μmol/L (2 mg/L) above local upper reference limits
Serum creatine kinase                  >5 μkat/L (300 U/L)
Serum C-reactive protein               >20 mg/L
Serum uric acid                        >475 μmol/L (80 mg/L)
Serum triglycerides                    >3.39 mmol/L (3.00 g/L)
Serum total cholesterol                >7.76 mmol/L (3.00 g/L)
Serum albumin                          <32 g/L
Blood erythrocytes                     <4.0 or >5.9 mil/μL (males); <3.4 or >5.2 mil/μL (females)
Blood haemoglobin                      <120 g/L (males); <110 g/L (females)
Blood white blood cells                <3000/μL or >12,500/μL
Blood platelets                        <100/nL
Hematocrit                             <36% or >52% (males); <33% or >47% (females)
Mean cellular volume                   <70 fL or >97 fL
Hepatitis B (HB) surface antigen       Positive
IgG antibodies anti-HB core antigen    Positive
Anti-hepatitis C virus antibodies      Positive
Blood was collected from each individual after an overnight fast and serum was obtained after centrifugation. In addition to the aliquot for the RI study, other blood samples were collected to exclude known causes of AST, ALT, or GGT increases (e.g., subclinical liver or muscle disease) using other laboratory investigations. The tests performed and their cut-off thresholds for subject exclusion are shown in Table 1.
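As an illustration only, the screening step described above can be sketched in Python. The sketch applies a subset of the clinical criteria and Table 1 thresholds (BMI, alcohol intake, fasting glucose, C-reactive protein, albumin, haemoglobin and hepatitis serology). The record layout and all field names are hypothetical and are not taken from the study database.

```python
def exclusion_reasons(subject):
    """Return the reasons a candidate would be excluded (empty list = eligible).

    `subject` is a plain dict; every key name below is a hypothetical
    illustration, not a field of the study's actual data set.
    """
    reasons = []
    is_male = subject["sex"] == "M"
    if subject["bmi_kg_m2"] > 30:
        reasons.append("BMI >30 kg/m2")
    if subject["alcohol_g_day"] > 30:
        reasons.append("alcohol >30 g/day")
    if subject["glucose_mmol_l"] > 7.0:
        reasons.append("fasting glucose >7.0 mmol/L")
    if subject["crp_mg_l"] > 20:
        reasons.append("CRP >20 mg/L")
    if subject["albumin_g_l"] < 32:
        reasons.append("albumin <32 g/L")
    # Haemoglobin has a sex-specific lower threshold in Table 1.
    if subject["haemoglobin_g_l"] < (120 if is_male else 110):
        reasons.append("low haemoglobin")
    if subject["hbsag_positive"] or subject["anti_hcv_positive"]:
        reasons.append("positive hepatitis serology")
    return reasons


# Hypothetical candidate who passes every criterion checked above.
candidate = {
    "sex": "F", "bmi_kg_m2": 22.4, "alcohol_g_day": 5,
    "glucose_mmol_l": 4.9, "crp_mg_l": 1.2, "albumin_g_l": 44,
    "haemoglobin_g_l": 131, "hbsag_positive": False,
    "anti_hcv_positive": False,
}
print(exclusion_reasons(candidate))
```

Returning the list of reasons, rather than a single yes/no flag, makes it possible to tabulate why candidates were rejected.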
Statistical evaluation
The results from reference individuals were analyzed using RefVal 4.0 software (16). Based on previous data in the literature indicating gender differences (17, 18) for the enzyme catalytic activities evaluated in serum, data for males and females were analyzed separately. A preliminary evaluation for the presence of outliers was performed, but it was decided not to exclude any observation unless clear analytical or biological reasons could be demonstrated.
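For readers unfamiliar with non-parametric reference limits, the following minimal Python sketch shows the rank-based estimate commonly recommended by the IFCC, in which the 2.5th and 97.5th percentiles are read at the fractional ranks 0.025(n+1) and 0.975(n+1) of the sorted data. It is an assumption that this matches the computation in RefVal 4.0; the confidence intervals are not reproduced, and the example data are synthetic.

```python
def rank_percentile(sorted_values, p):
    """Value at fractional rank p*(n+1), with linear interpolation."""
    n = len(sorted_values)
    rank = p * (n + 1)                      # 1-based fractional rank
    rank = min(max(rank, 1.0), float(n))    # clamp to the observed range
    lower = int(rank)                       # integer part of the rank
    if lower >= n:
        return sorted_values[-1]
    frac = rank - lower
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + frac * (above - below)


def reference_limits(values, lower_p=0.025, upper_p=0.975):
    """Non-parametric lower and upper reference limits of a sample."""
    data = sorted(values)
    return rank_percentile(data, lower_p), rank_percentile(data, upper_p)


# Synthetic activities (U/L), for illustration only.
synthetic_activities = [10 + 0.2 * i for i in range(199)]
low, high = reference_limits(synthetic_activities)
print(low, high)
```

Because the estimate depends only on ranks, it makes no assumption about the shape of the distribution, which suits the skewed enzyme data.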
Results
Value assignment to trueness materials and alignment of laboratory assays
The TCMs were first analysed by the reference laboratories. The mean of the means (±SD) of results for TCM levels 1, 2, and 3 obtained in these laboratories were 41.4 U/L (±1.63 U/L) [0.69±0.027 μkat/L], 70.8 U/L (±1.69 U/L) [1.18±0.028 μkat/L], and 97.3 U/L (±1.84 U/L) [1.622±0.031 μkat/L] for AST; 39.7 U/L (±2.18 U/L) [0.662±0.036 μkat/L], 67.6 U/L (±2.61 U/L) [1.127±0.044 μkat/L], and 96.9 U/L (±2.85 U/L) [1.615±0.048 μkat/L] for ALT; and 38.3 U/L (±0.90 U/L) [0.638±0.015 μkat/L], 75.8 U/L (±1.74 U/L) [1.263±0.029 μkat/L], and 96.6 U/L (±0.87 U/L) [1.61±0.015 μkat/L] for GGT. These results were used as target values for standardization of the participating laboratories, which measured the same TCMs with their commercial assay systems to verify traceability. Figure 1 shows the results obtained compared with the results obtained by the reference laboratories. As can be seen for ALT, the overlap between the commercial systems and reference procedure values was almost complete; only the high-level TCM in Beijing was outside the limits, but since the discrepancy was seen only for values well above the RI, no data correction was used. For AST, results from Milan 2 did not match the reference procedure intervals, so the reference values obtained from this laboratory were recalculated by multiplying by a correction factor of 0.933 obtained from the corresponding regression equation. For GGT, the data for all centers were recalculated using the following correction factors: Milan 1 = 0.965, Milan 2 = 1.046, and Beijing = 1.031.
Reference individuals
After application of the exclusion criteria, a total of 810 individuals (142 Asians and 668 Caucasians) were enrolled in the study: 188 (101 females, 87 males) from Milan (two centers), 136 (61 females, 75 males) from Beijing, 356 from the NOBIDA Biobank (188 females, 168 males), and 130 from Bursa (79 females, 51 males). However, 45 of these individuals (39 Asians and six Caucasians) were excluded from the calculation because they were younger than 18 years. Characteristics of the four population groups are shown in Table 2, and the distribution of the individuals according to age and ethnicity is shown in Table 3. For AST and GGT, we observed a weak but statistically significant (p≤0.05) correlation with age and BMI for both genders (r² <0.09), while ALT was only slightly related to BMI (r² = 0.031 in females and 0.076 in males).
Reference intervals
No data were eliminated as statistical outliers. RIs were computed non-parametrically for each center, and the Lahti algorithm (19, 20) was applied to the corresponding upper reference limits to judge whether partitioning of the data was required. Because the Lahti approach can be applied to a pair of groups only, each regional group was compared with the other three (a total of six comparisons) (Table 4). For none of the three enzymes did the algorithm indicate the need for partitioning of a particular group in all the pairwise comparisons with other groups.
AST
Females
AST activities across the four groups were significantly different (one-way analysis of variance; ANOVA), but the Lahti algorithm indicated no need of partitioning for
Figure 1 Results obtained on the "trueness control material" by the three participating laboratories (means±SD are indicated for each enzyme and at each level). Grey areas between dashed lines represent the mean±SD area obtained with results by the reference laboratories.
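The correction factors applied when a routine system did not match the reference values come from a regression line forced through zero. A minimal Python sketch of that calculation follows, assuming that the reference-laboratory target values are regressed on the routine means, so that the slope is the multiplier applied to routine results. The AST targets are the values reported in the text; the routine means are hypothetical numbers chosen only to illustrate a factor close to the 0.933 reported for Milan 2.

```python
def slope_through_origin(routine_means, target_values):
    """Least-squares slope of target = k * routine, with no intercept."""
    numerator = sum(x * y for x, y in zip(routine_means, target_values))
    denominator = sum(x * x for x in routine_means)
    return numerator / denominator


# AST target values (U/L) assigned by the reference laboratories (from the text).
ast_targets = [41.4, 70.8, 97.3]
# Hypothetical routine-method means (U/L) for the same three TCM levels.
routine_means = [44.4, 75.9, 104.3]

factor = slope_through_origin(routine_means, ast_targets)
# Applying the factor moves the routine means back toward the targets.
corrected = [round(factor * x, 1) for x in routine_means]
print(round(factor, 3), corrected)
```

Dropping the intercept means each result is corrected by a fixed proportion, which avoids depending on an intercept extrapolated far below the 40 to 100 U/L range covered by the control materials.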
Table 2 Characteristics of the reference population groups enrolled in the study.
           Females                                                Males
           Number of     Age, yearsᵃ    BMI, kg/m²ᵃ               Number of     Age, yearsᵃ     BMI, kg/m²ᵃ
           individuals                                            individuals
Milan      97            40 (21–72)     21.2 (16.2–28.7)          85            42 (18–76)      24.4 (18.4–29.4)
Beijing    47            47 (21–73)     21.7 (18.0–29.7)          50            42 (22–76)      23.1 (18.4–29.3)
NORIP      188           52 (19–90)     23.1 (17.3–29.8)          168           51.5 (18–85)    24.8 (19.2–29.9)
Bursa      79            32 (20–54)     22.1 (16.9–30.0)          51            31 (20–59)      23.9 (17.0–29.4)
ᵃ Median (range). BMI, body mass index.
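The pooled sample sizes quoted elsewhere in the article can be cross-checked against Table 2 with a few lines of Python. The counts below are those of Table 2; the helper name `pooled_n` is introduced here purely for illustration.

```python
# Number of reference individuals per regional group (Table 2).
group_sizes = {
    "Milan":   {"females": 97,  "males": 85},
    "Beijing": {"females": 47,  "males": 50},
    "NORIP":   {"females": 188, "males": 168},
    "Bursa":   {"females": 79,  "males": 51},
}


def pooled_n(sizes, sex, exclude=()):
    """Total individuals of one sex, optionally leaving out some groups."""
    return sum(group[sex] for name, group in sizes.items()
               if name not in exclude)


females_all = pooled_n(group_sizes, "females")                   # AST and ALT
males_all = pooled_n(group_sizes, "males")
females_ggt = pooled_n(group_sizes, "females", exclude=("NORIP",))
males_ggt = pooled_n(group_sizes, "males", exclude=("NORIP",))
print(females_all, males_all, females_all + males_all)
print(females_ggt, males_ggt)
```

The totals agree with the 411 females, 354 males and 765 individuals used for AST and ALT, and with the 223 females and 186 males that remain for GGT once the NORIP group is left out.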
Table 3 Distribution of reference individuals according to age (years) and ethnicity.
          Males                           Females
Age       Caucasian   Asian   Total       Caucasian   Asian   Total
18–30     70          12      82          96          15      111
31–40     61          11      72          78          3       81
41–60     98          21      118         121         23      144
61–85     72          9       81          68          7       75
Total     301         53      354         363         48      411
Table 4 Differences between upper reference limits obtained in individuals enrolled in different centers.
[Table layout not recoverable from the extracted text. Column groups: AST females, AST males, ALT females, ALT males, GGT females, GGT males, each with the comparison groups Milan, Bursa and NORIP; rows: Beijing, Bursa, Milan. Cell values as extracted, in source order and without reliable column assignment: P M M; Np Np –; M – –; P M M; Np – –; M Np –; M Np –; M M M; Np – –; M Np –; P – –; Np P –; Np M Np; Np – –; P Np Np; M – –; Np M Np; Np Np –.]
The results from each center are compared with the others using the Lahti algorithm to judge the need for partitioning: P, partitioning suggested (more than 4.1% of the population of a single group lies outside the limits described by the distribution of the common group); Np, no need of partitioning (<3.2% of the population of a single group lies outside the limits described by the distribution of the common group); M, marginal (a percentage between 3.2 and 4.1 of the population of a single group lies outside the limits described by the distribution of the common group).
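The classification rule given in the legend of Table 4 can be written as a short Python sketch. This is a simplified reading of the legend only: it takes the percentage of a subgroup lying above the upper limit of the combined group and applies the 3.2% and 4.1% thresholds. The full Lahti procedure (19, 20) includes further criteria that are not reproduced here, and the function names are illustrative.

```python
def fraction_above(values, upper_limit):
    """Fraction of a subgroup lying above the common upper reference limit."""
    return sum(1 for v in values if v > upper_limit) / len(values)


def lahti_flag(fraction_outside):
    """Classify a comparison using the thresholds in the Table 4 legend."""
    percent = 100 * fraction_outside
    if percent > 4.1:
        return "P"    # partitioning suggested
    if percent < 3.2:
        return "Np"   # no need of partitioning
    return "M"        # marginal


# With a common 97.5th percentile limit, about 2.5% of a well-matched
# subgroup is expected to lie above it.
print(lahti_flag(0.025), lahti_flag(0.035), lahti_flag(0.05))
```

A subgroup whose proportion above the common limit stays near the nominal 2.5% is treated as compatible with the pooled interval, whereas a proportion well above it signals that a separate upper limit may be needed.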
derivation of RI (Table 4). Combining results from all groups (411 individuals), the RIs shown in Table 5 were obtained.
Males
AST activities across the four groups were statistically different (one-way ANOVA); the Lahti algorithm suggested partitioning for two of six comparisons (Table 4). However, the absolute differences in activities between regions were small (95th percentile values ranged from a minimum of 29.8 U/L [0.50 μkat/L] in Milan to a maximum of 35.5 U/L [0.59 μkat/L] in Bursa). Thus, all data (354 individuals) were merged. The resulting RIs are shown in Table 5.
Although the median of the male group (21.0 U/L [0.35 μkat/L]) was 3 U/L (0.05 μkat/L) higher than that of the female group (18.0 U/L [0.30 μkat/L]), the corresponding upper reference limits were closer. The Lahti algorithm indicated only a marginal statistical difference. Thus, it was possible to estimate a single RI for both genders based on the entire group of reference values (n = 765) (Table 5).
ALT
Females
The Beijing group appeared to be somewhat different from the other regional groups (marginal or partitioning indication according to Lahti, see Table 4). However, because of the limited number of individuals in this group (n = 47), we decided to combine all 411 data points. The resulting RIs are shown in Table 5.
Males
Two clearly abnormal results were present (one in the Milan group: 113 U/L [1.88 μkat/L] and the second in the NORIP group: 123 U/L [2.05 μkat/L]). Since there were no evident reasons for explaining these results, they were not excluded. Although the upper reference limit of the Beijing group had marginal statistical differences from the other groups, these differences were judged to be clinically irrelevant and all the data were merged (Table 5).
GGT
Females
The 97.5th percentile value for the NORIP group was markedly different from those of the other three regions. Even after elimination of an outlier (173 U/L; 2.88 μkat/L), the 97.5th percentile value was 68.9 U/L (1.15 μkat/L), whereas the 97.5th percentile values for the other groups ranged from 39.0 U/L (0.65 μkat/L) (Beijing) to 44.1 U/L (0.74 μkat/L) (Milan). Based upon this difference, the NORIP data were excluded from the final calculation, even though the Lahti algorithm indicated only a marginal statistical difference (Table 4). The RIs obtained for the remaining 223 individuals are shown in Table 5.
Males
As seen for females, the 97.5th percentile value for males in the NORIP group (113.9 U/L [1.90 μkat/L]) was markedly different from those of the other three groups, which ranged from 64.2 U/L (1.07 μkat/L) (Beijing) to 69.1 U/L (1.15 μkat/L) (Milan). Thus, for males too the RI was calculated by excluding the NORIP data. The resulting reference limits (on 186 individuals) are shown in Table 5.
Table 5 Proposed reference intervals for the three enzymes that were evaluated.
                Females                                Males                                  Cumulative
                2.5th percentile   97.5th percentile   2.5th percentile   97.5th percentile   2.5th percentile   97.5th percentile
                (90% CI)           (90% CI)            (90% CI)           (90% CI)            (90% CI)           (90% CI)
AST, U/L        11.0 (10.0–12.0)   33.4 (30.0–35.5)    13.9 (11.0–14.7)   35.1 (33.0–41.0)    11.0 (10.0–12.0)   34.0 (32.8–36.0)
AST, μkat/L     0.18 (0.17–0.20)   0.56 (0.50–0.59)    0.23 (0.18–0.25)   0.59 (0.55–0.68)    0.18 (0.17–0.20)   0.57 (0.55–0.60)
ALT, U/L        7.8 (6.0–8.0)      41.0 (35.0–57.0)    9.0 (7.8–10.0)     59.0 (51.0–62.0)    –                  –
ALT, μkat/L     0.13 (0.10–0.13)   0.68 (0.58–0.95)    0.15 (0.13–0.17)   0.98 (0.85–1.03)    –                  –
GGTᵃ, U/L       6.4 (4.8–6.8)      39.7 (32.0–58.0)    11.7 (10.3–12.5)   67.5 (48.3–100.4)   –                  –
GGTᵃ, μkat/L    0.11 (0.08–0.11)   0.66 (0.53–0.97)    0.20 (0.17–0.21)   1.13 (0.81–1.67)    –                  –
ᵃ NORIP data excluded. CI, confidence interval.
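The two unit systems used in Table 5 are related by a fixed factor: 1 U corresponds to 1 μmol/min and 1 kat to 1 mol/s, so 1 U/L equals 1/60 μkat/L (about 0.0167 μkat/L). A minimal Python sketch of the conversion follows; the function names are illustrative.

```python
SECONDS_PER_MINUTE = 60


def u_per_l_to_ukat_per_l(activity_u_per_l):
    """Convert catalytic activity concentration from U/L to μkat/L."""
    return activity_u_per_l / SECONDS_PER_MINUTE


def ukat_per_l_to_u_per_l(activity_ukat_per_l):
    """Convert catalytic activity concentration from μkat/L to U/L."""
    return activity_ukat_per_l * SECONDS_PER_MINUTE


# Cumulative AST reference limits from Table 5, in U/L.
for limit in (11.0, 34.0):
    print(limit, "U/L =", round(u_per_l_to_ukat_per_l(limit), 2), "μkat/L")
```

Converting the cumulative AST limits of 11.0 and 34.0 U/L in this way gives 0.18 and 0.57 μkat/L, in agreement with the table.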
Discussion
AST, ALT and GGT are among the most commonly requested laboratory tests, and are key for the laboratory evaluation of liver damage. Values above the upper reference limits of these enzymes are often used to trigger further clinical and laboratory investigations. Incorrect setting of these limits can result in costly and invasive evaluations or, conversely, may allow life-threatening liver disease to progress unrecognized. Factors that influence the correctness of these limits can be divided into two categories: 1) differences in analytical methodologies, and 2) differences in the way the RIs were defined. The first factor can be minimized by the adoption of methods producing results traceable to the corresponding reference measurement system (6). The second factor is more relevant, as demonstrated by several authors reporting a wide range of RIs, even in laboratories using similar methods (21–25). Possible reasons to explain this phenomenon are numerous and include differences in the criteria used to select the reference individuals, the number of individuals selected, the statistical approach used to derive the reference limits, or the adoption by laboratories of manufacturers' suggested RIs, often without any validation. The importance of improving both assay standardization and approaches for defining RIs has been discussed by Wu recently (26).
In our study, we attempted to improve the current situation by obtaining sound reference values, following a well-defined methodology for the selection of the reference individuals and using analytical methods with demonstrated traceability. For effective exclusion of individuals with minor subclinical hepatic damage, such as that related to non-alcoholic fatty liver disease (NAFLD), ultrasound examination should have been performed.
However, as screening for reference individuals by ultrasound was not economically feasible, we excluded individuals with conditions known to be correlated with NAFLD: obesity (BMI >30 kg/m²) and hyperlipidemia (total serum cholesterol and triglycerides >3 g/L) (27, 28). Other common causes of liver damage, including excessive alcohol consumption or viral hepatitis B or C, were also excluded. Samples obtained through the NOBIDA Biobank did not have hepatitis B or C testing performed at the origin, but all the samples were tested for hepatitis B surface antigen and hepatitis C antibodies in the center performing the enzyme measurements. All the samples with very high GGT (>80 U/L) were also tested for carbohydrate-deficient transferrin (CDT), with the elimination of any sample with abnormal results (CDT >2%). Thus, our selected population had a low probability of liver abnormalities, even if subclinical cytomegalovirus or Epstein-Barr infection cannot be excluded. Analytically, the traceability of our results to the reference measurement system was verified through the use of TCMs that had values assigned by the reference procedures. The objective of our work was to define RIs that could be adopted by any laboratory using analytical systems that provide traceable results, but also to verify the possible existence (or not) of differences in the upper reference limits due to ethnicity or life habit. The latter was only partially achieved. In fact, the skewed distribution of the data and the gender-related differences require a larger number of study subjects to check for the possible effects of these factors. GGT in serum was the only enzyme that appeared to be markedly different in the Nordic Countries group vs. the other three regional populations for both males and females. The difference was caused essentially by a small subgroup of individuals with substantially higher GGT values that were not explained by the commonly known factors that increase GGT, such as drug intake (explicitly excluded by the individuals) or chronic alcohol consumption (excluded by CDT measurements). For AST, the differences between groups (regional or gender-related), although sometimes statistically significant, were considered too small to be clinically important. This permitted merging of all the results to derive a single RI. The small difference between the upper reference limits found in the two genders was in agreement with previous literature.
Indeed, Schumann and Klauke (29) found a 4 U/L difference (31 vs. 35 U/L) and Leino et al. a 3 U/L difference (33 vs. 36 U/L) (17). The available literature on ALT reference values is extensive, owing to the relevance of this measurement in transfusion medicine. The reported upper reference limits for ALT
ranged from 19 U/L and 31 U/L for females and males, respectively, in the report by Prati et al. (30) to 46 U/L and 68 U/L, respectively, in the Nordic Project (31). However, the study of Prati et al. had several different characteristics: data were obtained using an unoptimized assay (no pyridoxal phosphate in the reagent mixture) and the upper reference limit was set at the 95th percentile and not at the 97.5th percentile of the data distribution. Furthermore, all the ALT results >40 U/L (males) and >30 U/L (females) were discarded a priori, and the BMI limit for subject inclusion was set at 24.9 kg/m². For all these reasons, the limits derived by Prati et al., which were set not to evaluate healthy individuals but to monitor anti-HCV–positive subjects over time, cannot be compared with those derived in the present study. However, our ALT data are comparable with those obtained by Chan et al. in a Chinese population (upper reference limits of 36 U/L and 53 U/L for females and males, respectively), although they used an assay that did not contain pyridoxal phosphate (32). However, our results are substantially higher than those obtained by Brinkmann et al. on blood donors (females, 34.4 U/L; males, 43.9 U/L) (33) and by Schumann and Klauke, who studied hospitalized patients (females, 34 U/L; males, 45 U/L) (29). Moreover, ALT RIs in our study did not appear to be age-related, contrasting with results shown in some data mining studies where, especially for males, ALT values appear to increase in middle age (34, 35). This difference can likely be explained by our careful selection of individuals, whereas in data mining studies it is more difficult to exclude alcohol and therapeutic drug users. For AST and GGT, the correlation with age was statistically significant, particularly for women, but it was extremely weak and clinically irrelevant.
Also, the effect of BMI, when limited below our pre-defined threshold of 30 kg/m², appeared weak and did not affect the calculation of the upper reference limit. Insufficient data were available for children and adolescents to calculate specific RIs for these age groups. Although a recent publication (36) has found that the RIs for AST, ALT and GGT do not change after the first few years of age, thereafter being very similar to those found in adults, we decided not to include the data of children and adolescents in the calculation of RIs. GGT results from the Nordic Countries subgroup (upper reference limit of 69 U/L and 114 U/L for females and males, respectively) were similar to the original data published by Strømme et al. for individuals older than 40 years (77 U/L and 114 U/L) (31). This occurred despite the fact that our results were derived from a subset of approximately 350 of the original 1391 participants in the NORIP study, and our selection criteria were more stringent (45 out of 400 samples originally received were excluded, primarily for excessive BMI, dyslipidemia, and increased mean erythrocyte volume; two individuals were also hepatitis B surface antigen positive, three were anti-hepatitis B core antigen positive and three had CDT >2%). The difference in GGT values between the Nordic Countries group and the other three groups was only marginally significant according to Lahti. Although in each regional population a small subgroup of individuals had higher GGT values, in the Nordic Countries group this subgroup was more numerous and caused a dramatic increase
Figure 2 Distribution of GGT results. (A) Distribution of GGT results for male reference individuals from the four regions. The arrows indicate the 97.5th percentile limits for (1) Scandinavia: 114 U/L, (2) Milan: 69 U/L, (3) Beijing: 64 U/L, and (4) Bursa: 66 U/L. (B) Distribution of GGT results for female reference individuals from the four regions. The arrows indicate the 97.5th percentile limits for (1) Scandinavia: 69 U/L, (2) Milan: 44 U/L, (3) Beijing: 39 U/L, and (4) Bursa: 40.5 U/L.
in the upper reference limit (Figure 2). Nevertheless, we were not able to find biological reasons to explain this behaviour. Perhaps evaluation of the pattern of GGT subfractions in different groups could help in shedding light on this issue (37, 38). The GGT results for females obtained by merging the other three groups (upper limit of 40 U/L [0.67 μkat/L]) compare well with those shown by Schumann and Klauke (38 U/L) (29) and Leino et al. (42 U/L) (17). For males, the upper reference limit for GGT (68 U/L [1.13 μkat/L]) is higher than that obtained in the two studies mentioned previously (55 U/L and 58 U/L, respectively) and also by Steinmetz et al. (49 U/L) (39). However, our GGT data were highly skewed and the 90% confidence interval (CI) around the upper reference limit includes all these previously published limits.
Conclusions
Our study demonstrates that common RIs are applicable for AST and ALT. The prerequisite is the use of analytical systems providing results that are traceable to IFCC reference measurement systems. If this is the case, the proposed RIs can be adopted, replacing those currently in use. The case of GGT appears to be more complex, and a common RI cannot definitively be suggested, even though three of the four regional reference sample groups (Italy, Turkey and China) showed very similar results. Further studies are needed to understand the reasons for these differences, which were found in apparently healthy individuals not otherwise affected by any of the known causes of increased GGT.
Acknowledgements
The authors acknowledge Gerhard Schumann (Hannover Medical School, Hannover, Germany), Elena Guerra (Diagnostica e Ricerca San Raffaele, Milano, Italy), and Ilenia Infusino (CIRME, University of Milano, Italy) for enzyme measurements using reference measurement procedures. Alessandra Ratto, Carlo Ferrero (HSR, Milano, Italy) and Cristina Valente (Luigi Sacco University-Hospital, Milano, Italy) are also acknowledged for their invaluable support in selecting reference individuals and measuring the corresponding samples. Kiyoshi Ichihara (Yamaguchi University Graduate School of Medicine, Ube, Yamaguchi, Japan) supervised data elaboration.
Conflict of interest statement
Authors’ conflict of interest disclosure: The authors stated that there are no conflicts of interest regarding the publication of this article. Research funding played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.
Research funding: This study was partially supported by the IFCC. Minor economic support was also given by Abbott Laboratories, Ortho-Clinical Diagnostics Inc., Sysmex Europe GmbH, and The Binding Site Ltd. The Bursa part of the study was supported by The Uludag University Research Fund (Grant UAP(T)-2009/10).
Employment or leadership: None declared.
Honorarium: None declared.
References
1. Ceriotti F, Henny J. Are my laboratory results normal? Considerations to be made concerning reference intervals and decision limits. eJIFCC 2008. http://www.ifcc.org/ejifcc/vol19no2/190201200803.htm.
2. Schumann G, Bonora R, Ceriotti F, Ferard G, Ferrero CA, Franck PF, et al. IFCC primary reference procedures for the measurement of catalytic activity concentrations of enzymes at 37 degrees C. International Federation of Clinical Chemistry and Laboratory Medicine. Part 4. Reference procedure for the measurement of catalytic concentration of alanine aminotransferase. Clin Chem Lab Med 2002;40:718–24.
3. Schumann G, Bonora R, Ceriotti F, Ferard G, Ferrero CA, Franck PF, et al. IFCC primary reference procedures for the measurement of catalytic activity concentrations of enzymes at 37 degrees C. International Federation of Clinical Chemistry and Laboratory Medicine. Part 5. Reference procedure for the measurement of catalytic concentration of aspartate aminotransferase. Clin Chem Lab Med 2002;40:725–33.
4. Schumann G, Bonora R, Ceriotti F, Ferard G, Ferrero CA, Franck PF, et al. IFCC primary reference procedures for the measurement of catalytic activity concentrations of enzymes at 37 degrees C. International Federation of Clinical Chemistry and Laboratory Medicine. Part 6. Reference procedure for the measurement of catalytic concentration of gamma-glutamyltransferase. Clin Chem Lab Med 2002;40:734–8.
5. Clinical and Laboratory Standards Institute (CLSI). Defining, establishing, and verifying reference intervals in the clinical laboratory; approved guideline – third edition. CLSI document C28-A3, 2008.
6. Infusino I, Schumann G, Ceriotti F, Panteghini M. Standardization in clinical enzymology: a challenge for the theory of metrological traceability. Clin Chem Lab Med 2010;48:301–7.
7. Ceriotti F, Hinzmann R, Panteghini M. Reference intervals: the way forward. Ann Clin Biochem 2009;46:8–17.
8. Ceriotti F. Prerequisites for use of common reference intervals. Clin Biochem Rev 2007;28:115–21.
9. Ichihara K, Itoh Y, Min WK, Sook FY, Lam CW, Kong XT, et al. Diagnostic and epidemiological implications of regional differences in serum concentrations of proteins observed in six Asian cities. Clin Chem Lab Med 2004;42:800–9.
10. Ichihara K, Itoh Y, Lam CW, Poon PM, Kim JH, Kyono H, et al. Sources of variation for commonly measured serum analytes among 6 Asian cities and consideration of common reference intervals. Clin Chem 2008;54:356–65.
11. Boyd JC. Cautions in the adoption of common reference intervals. Clin Chem 2008;54:238–9.
12. CLSI/NCCLS. Preparation and validation of commutable frozen human serum pools as secondary reference materials for cholesterol measurement procedures; Approved guideline. CLSI document C37-A. Wayne, PA: NCCLS; 1999.
13. Jansen R, Schumann G, Baadenhuijsen H, Franck P, Franzini C, Kruse R, et al. Trueness verification and traceability assessment of results from commercial systems for measurement of six enzyme activities in serum: an international study in the EC4 framework of the Calibration 2000 project. Clin Chim Acta 2006;368:160–7.
14. Rustad P, Simonsson P, Felding P, Pedersen M. Nordic Reference Interval Project Bio-bank and Database (NOBIDA): a source for future estimation and retrospective evaluation of reference intervals. Scand J Clin Lab Invest 2004;64:431–8.
15. Directive 98/79/EC of the European Parliament and of the Council of 27 October 1998 on in vitro diagnostic medical devices. Official Journal of the European Communities 7.12.1998 L331.
16. Solberg HE. The IFCC recommendation on estimation of reference intervals. The RefVal program. Clin Chem Lab Med 2004;42:710–4.
17. Leino A, Impivaara O, Irjala K, Mäki J, Peltola O, Järvisalo J. Health-based reference intervals for ALAT, ASAT and GT in serum, measured according to the recommendations of the European Committee for Clinical Laboratory Standards (ECCLS). Scand J Clin Lab Invest 1995;55:243–50.
18. Roberts WL, McMillin GA, Burtis CA, Bruns DE. Reference information for the clinical laboratory. In: Burtis CA, Ashwood ER, Bruns DE, editors. Tietz textbook of clinical chemistry and molecular diagnostics, 4th ed. Philadelphia, PA: Elsevier Saunders, 2006:2251–318.
19. Lahti A, Hyltoft Petersen P, Boyd JC, Fraser CG, Jørgensen N. Objective criteria for partitioning Gaussian-distributed reference values into subgroups. Clin Chem 2002;48:338–52.
20. Lahti A, Hyltoft Petersen P, Boyd JC, Rustad P, Laake P, Solberg HE. Partitioning of nongaussian distributed biochemical reference data into subgroups. Clin Chem 2004;50:891–900.
21. Ceriotti F. Reference intervals in the new millenium. Biochim Clin 2007;31:254–66.
22. Neuschwander-Tetri BA, Ünalp A, Creer MH. Influence of local reference populations on upper limits of normal for serum alanine aminotransferase levels. Arch Intern Med 2008;168:663–5.
23. Férard G, Imbert-Bismut F, Messous D, Piton A, Ueda S, Poynard T, et al. A reference material for traceability of aspartate aminotransferase (AST) results. Clin Chem Lab Med 2005;43:549–53.
24. Dufour D. Alanine aminotransferase: is healthy to be "normal"? Hepatology 2009;50:1699–701.
25. Dutta A, Saha C, Johnson CS, Chalasani N. Variability in the upper limit of normal for serum alanine aminotransferase levels: a statewide study. Hepatology 2009;50:1957–62.
26. Wu A. Standardization of assays for clinically important enzymes that have high biologic variation: what is all the fuss about? Clin Chem Lab Med 2010;48:299–300.
27. Mulhall BP, Ong JP, Younossi ZM. Non-alcoholic fatty liver disease: an overview. J Gastroenterol Hepatol 2002;17:1136–43.
28. Patt CH, Yoo HY, Dibadj K, Flynn J, Thuluvath PJ. Prevalence of transaminase abnormalities in asymptomatic, healthy subjects participating in an executive health-screening program. Dig Dis Sci 2003;48:797–801.
29. Schumann G, Klauke R. New IFCC reference procedures for the determination of catalytic activity concentrations of five enzymes in serum: preliminary upper reference limits obtained in hospitalized subjects. Clin Chim Acta 2003;327:69–79.
30. Prati D, Taioli E, Zanella A, Della Torre E, Butelli S, Del Vecchio E, et al. Updated definitions of healthy ranges for serum alanine aminotransferase levels. Ann Intern Med 2002;137:1–9.
31. Strømme JH, Rustad P, Steensland H, Theodorsen L, Urdal P. Reference intervals for eight enzymes in blood of adult females and males measured in accordance with the International Federation of Clinical Chemistry reference system at 37°C: part of the Nordic Reference Interval Project. Scand J Clin Lab Invest 2004;64:371–84.
32. Chan AO, Lee KC, Leung JN, Shek CC. Reference intervals of common serum analytes of Hong Kong Chinese. J Clin Pathol 2008;61:632–6.
33. Brinkmann T, Dreier J, Diekmann J, Götting C, Klauke R, Schumann G, et al. Alanine aminotransferase cut-off values for blood donor screening using the new International Federation of Clinical Chemistry reference method at 37 degrees C. Vox Sang 2003;85:159–64.
34. Bock BJ, Dolan CT, Miller GC, Fitter WF, Hertsell BD, Crowson AN, et al. The data warehouse as a foundation for population-based reference intervals. Am J Clin Pathol 2003;120:662–70.
35. Grossi E, Colombo R, Cavuto S, Franzini C. Age and gender relationships of serum alanine aminotransferase values in healthy subjects. Am J Gastroenterol 2006;101:1675–6.
36. Heiduk M, Päge I, Kliem C, Abicht K, Klein G. Pediatric reference intervals determined in ambulatory and hospitalized children and juveniles. Clin Chim Acta 2009;406:156–61.
37. Franzini M, Bramanti E, Ottaviano V, Ghiri E, Scatena F, Barsacchi R, et al. A high performance gel filtration chromatography method for γ-glutamyltransferase fraction analysis. Anal Biochem 2008;374:1–6.
38. Franzini M, Passino C, Ottaviano V, Fierabracci V, Pompella A, Paolicchi A, et al. Fractions of plasma gamma-glutamyltransferase in healthy individuals: Reference values. Clin Chim Acta 2008;395:188–9.
39. Steinmetz J, Schiele F, Gueguen R, Férard G, Henny J and the Periodic Health Examination Centers Laboratory Working Group. Standardization of γ-glutamyltransferase assays by intermethod calibration. Effect on determining common reference limits. Clin Chem Lab Med 2007;45:1373–80.