European Radiology https://doi.org/10.1007/s00330-018-5330-5

MUSCULOSKELETAL

ADC as a useful diagnostic tool for differentiating benign and malignant vertebral bone marrow lesions and compression fractures: a systematic review and meta-analysis

Chong Hyun Suh 1 & Seong Jong Yun 2 & Wook Jin 2 & Sun Hwa Lee 3 & So Young Park 2 & Chang-Woo Ryu 2

Received: 6 December 2017 / Revised: 10 January 2018 / Accepted: 12 January 2018
© European Society of Radiology 2018

Abstract

Objectives To assess the sensitivity and specificity of quantitative assessment of the apparent diffusion coefficient (ADC) for differentiating benign and malignant vertebral bone marrow lesions (BMLs) and compression fractures (CFs).

Methods An electronic literature search of MEDLINE and EMBASE was conducted. Bivariate modelling and hierarchical summary receiver operating characteristic modelling were performed to evaluate the diagnostic performance of ADC for differentiating vertebral BMLs. Subgroup analysis was performed for differentiating benign and malignant vertebral CFs. Meta-regression analyses according to subject, study and diffusion-weighted imaging (DWI) characteristics were performed.

Results Twelve eligible studies (748 lesions, 661 patients) were included. The ADC exhibited a pooled sensitivity of 0.89 (95% confidence interval [CI] 0.80–0.94) and a pooled specificity of 0.87 (95% CI 0.78–0.93) for differentiating benign and malignant vertebral BMLs. In addition, the pooled sensitivity and specificity for differentiating benign and malignant CFs were 0.92 (95% CI 0.82–0.97) and 0.91 (95% CI 0.87–0.94), respectively. In the meta-regression analysis, the DWI slice thickness was a significant factor affecting heterogeneity (p < 0.01); thinner slice thickness (< 5 mm) showed higher specificity (95%) than thicker slice thickness (81%).

Conclusions Quantitative assessment of ADC is a useful diagnostic tool for differentiating benign and malignant vertebral BMLs and CFs.

Key Points
• Quantitative assessment of ADC is useful in differentiating vertebral BMLs.
• Quantitative ADC assessment for BMLs had sensitivity of 89%, specificity of 87%.
• Quantitative ADC assessment for CFs had sensitivity of 92%, specificity of 91%.
• The specificity is highest (95%) with thinner (< 5 mm) DWI slice thickness.

Keywords Meta-analysis · Diffusion-weighted MRI · Spine · Bone marrow neoplasm · Compression fracture

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s00330-018-5330-5) contains supplementary material, which is available to authorized users.

* Seong Jong Yun [email protected]

1 Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, 88 Olympic-ro 43-gil, Songpa-gu, Seoul 05505, Republic of Korea
2 Department of Radiology, Kyung Hee University Hospital at Gangdong, Kyung Hee University School of Medicine, 891 Dongnam-ro, Gangdong-gu, Seoul 05278, Republic of Korea
3 Department of Emergency Medicine, Sanggye Paik Hospital, Inje University College of Medicine, 1342 Dongil-ro, Nowon-gu, Seoul 01757, Republic of Korea

Abbreviations
ADC Apparent diffusion coefficient
BML Bone marrow lesion
BVC Best value comparator
CF Compression fracture
DWI Diffusion-weighted imaging
HSROC Hierarchical summary receiver operating characteristic
QUADAS-2 Quality Assessment of Diagnostic Accuracy Studies-2

Introduction

Differentiating between benign and malignant vertebral marrow pathologies is challenging because fractures of both benign and malignant causes result in similar changes in signal intensity, except that the fat-containing spinal bone marrow is displaced by tumour cells in a malignant lesion [1, 2]. Differentiating between acute benign and malignant vertebral bone marrow lesions (BMLs) is particularly important in elderly patients with no or minor trauma history. Although conventional magnetic resonance imaging (MRI) methods have been successfully used [3, 4], an accurate diagnosis using these tools remains difficult. Previous studies [5–7] have proposed diffusion-weighted imaging (DWI) as a suitable method for differentiating between benign and malignant marrow pathologies. However, simple qualitative DWI data analyses raise the question of whether the T2 shine-through effect may have contributed to the appearance of such images [8]. Because of this limitation, some investigators have quantified the diffusion in abnormal vertebrae based on the apparent diffusion coefficient (ADC) value and concluded that quantitative assessment is more useful than qualitative assessment in differentiating benign from malignant vertebral BMLs [9, 10]. Nevertheless, this method has not been commonly used in this context because of some controversy related to overlapping ADC values in clinical practice.

Although a few meta-analyses on DWI of vertebral lesions have been conducted [11–13], most did not systematically review the role of quantitative ADC assessment. Therefore, the relative performance of the quantitative assessment of ADC in DWI for differentiating between benign and malignant BMLs and compression fractures (CFs) should be fully explored and presented as high-level evidence through quantitative synthesis of the existing studies. Pooling previous results is of particular interest to radiologists because the existing studies used different magnets and methodological settings, such as DWI sequences and b values.

Our main aim in the present meta-analysis was to assess the diagnostic performance of the quantitative assessment of ADC on DWI for differentiating benign and malignant BMLs and CFs.

Methods

This meta-analysis followed the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement [14].

Data sources

We conducted online literature searches of the MEDLINE and EMBASE databases up to September 2017 to identify relevant studies that quantitatively assessed ADC for vertebral BMLs. Search terms related to "vertebral lesions" or "vertebral fractures" were combined with "DWI" or "ADC" as follows: ((vertebral CFs) OR (benign vertebral fractures) OR (benign vertebral lesions) OR (malignant vertebral fractures) OR (malignant vertebral lesions)) AND ((diffusion weighted imaging) OR (diffusion-weighted) OR (DWI) OR (apparent diffusion coefficient) OR (ADC)). The bibliographies of the identified articles were screened to identify additional relevant studies. The search was limited to English-language studies. Two investigators (C.H.S. and S.J.Y.) screened the titles and abstracts for potential eligibility, and disagreements were resolved through discussion.
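The Boolean search string quoted above can be assembled from its two term groups. The following is a minimal sketch only: the helper name `or_group` and the variable names are hypothetical, and the actual database interface the authors used is not described in the text.

```python
# Sketch: rebuild the search string from the two term groups quoted in the text.
# `or_group` is a hypothetical helper introduced for illustration.

def or_group(terms):
    """Wrap each term in parentheses and join the terms with OR."""
    return "(" + " OR ".join(f"({term})" for term in terms) + ")"

lesion_terms = [
    "vertebral CFs",
    "benign vertebral fractures",
    "benign vertebral lesions",
    "malignant vertebral fractures",
    "malignant vertebral lesions",
]
imaging_terms = [
    "diffusion weighted imaging",
    "diffusion-weighted",
    "DWI",
    "apparent diffusion coefficient",
    "ADC",
]

# The two OR groups are combined with AND, as in the reported strategy.
query = or_group(lesion_terms) + " AND " + or_group(imaging_terms)
print(query)
```

Keeping the term lists separate from the joining logic makes it easy to see which concept each term belongs to when a search is re-run or updated.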

Study selection

We included studies that met the following criteria: (1) patients with vertebral BMLs; (2) quantified ADC in DWI used as the index test for differentiating between benign and malignant vertebral BMLs; (3) histopathology or best value comparator (BVC; the most reliable follow-up findings for comparing initial findings) used as the reference standard; and (4) "original article" publication type. The exclusion criteria were as follows: (1) case reports or series; (2) review articles, guidelines, consensus statements, letters, editorials and conference abstracts; (3) studies not relevant to our aim; (4) DWI used for BMLs but without quantitative evaluation; (5) insufficient data for the reconstruction of 2 × 2 tables regarding sensitivity and specificity; and (6) overlapping patient populations. In cases of overlapping study populations, the study with the largest study population was included. The authors of the studies were contacted for provision of further information when 2 × 2 tables could not be reconstructed.

Data extraction and quality assessment

Two investigators (C.H.S. and S.J.Y.) independently extracted the patient and study protocol characteristics and appraised the methodological quality of the included studies by using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool [15]. Inconsistencies between the reviewers were resolved by consensus. The following data were extracted from the selected studies by using a standardized form: (1) patient characteristics, including the number of patients, number of lesions, percentage of malignancy, clinical features, mean age, age range and sex; (2) study characteristics, including the origin of the study, publication year, study design, reference standard, interval between MRI and reference standard, blinding to reference standard and characteristics of readers; and (3) MRI characteristics, including the scanner, technical parameters and interpretation. We extracted the study outcomes for 2 × 2 tables (i.e. true positive, true negative, false positive and false negative test results). We calculated the 2 × 2 tables using the Bayesian method only if the sensitivity and specificity were presented in an eligible study. If two or more reviewers independently assessed the diagnostic accuracy, the result with the highest accuracy was extracted.

Fig. 1 Flow diagram showing the study selection process for the meta-analysis
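When a study reports only sensitivity and specificity, a 2 × 2 table can be back-calculated from those proportions and the numbers of malignant and benign lesions. The sketch below shows one common way to do this; it is an assumption about the general approach, not the authors' exact procedure, and the function name and example counts are hypothetical.

```python
# Sketch: back-calculate a 2 x 2 table from reported sensitivity and specificity.
# Assumes the numbers of malignant (reference-positive) and benign
# (reference-negative) lesions are known. All example values are hypothetical.

def reconstruct_2x2(sensitivity, specificity, n_malignant, n_benign):
    """Return the true/false positive and negative counts implied by the inputs."""
    tp = round(sensitivity * n_malignant)   # malignant lesions called malignant
    fn = n_malignant - tp                   # malignant lesions called benign
    tn = round(specificity * n_benign)      # benign lesions called benign
    fp = n_benign - tn                      # benign lesions called malignant
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}

table = reconstruct_2x2(sensitivity=0.90, specificity=0.80,
                        n_malignant=50, n_benign=50)
print(table)
```

Because the counts are rounded to integers, the sensitivity and specificity recomputed from the table may differ slightly from the reported values when the reported proportions were themselves rounded.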

Table 1 Characteristics of the subjects

Author | No. of patients | No. of lesions | Malignant (%) | Clinical features | Mean age (years) (range) | Percentage of male (%)
Wonglaksanapimon et al. [47] | 22 | 39 | 18.0 | Osteoporotic, traumatic, infectious CFs vs. malignant CFs | 54.6 (35–84) | 36.4
Taşkin et al. [48] | 104 | 133 | 55.6 | Benign CFs with tumours vs. malignant CFs with tumours | 55.0 (19–89) | 51.0
Tadros et al. [49] | 30 | 30 | 53.3 | Benign BM lesions vs. malignant BM lesions | 54.0 (25–80) | 60.0
Sung et al. [50] | 62 | 62 | 48.4 | Osteoporotic CFs vs. malignant CFs | 68.7 (34–84) | 43.5
Pui et al. [51] | 51 | 128 | 39.1 | Infectious spondylitis vs. malignant BM lesions | 34.1 (2–72) | 51.0
Pozzi et al. [52] | 116 | 116 | 86.3 | Benign BM lesions vs. malignant BM lesions | 59.5 (14–85) | 53.4
Park S et al. [53] | 33 | 58 | 84.5 | Nodular hyperplastic hematopoietic BM vs. malignant BM lesions | 62.2 (35–84) | 48.5
Park HJ et al. [54] | 46 | 86 | 51.2 | Traumatic CFs vs. tumour infiltration with/without malignant CFs | 65.0 (23–87) | 56.5
Geith et al. [55] | 46 | 46 | 43.5 | Osteoporotic CFs vs. malignant CFs | 66.6 (25–86) | 41.3
Fawzy et al. [56] | 60 | 60 | 20.0 | Osteoporotic, traumatic, infectious CFs vs. malignant CFs | 61.0 (25–94) | 63.3
Abowarda et al. [57] | 41 | 68 | 44.1 | Osteoporotic CFs vs. malignant CFs | 54.7 (35–63) | 41.5
Abo Dewan et al. [58] | 50 | 96 | 42.7 | Osteoporotic, traumatic, infectious CFs vs. malignant CFs | 58.5 (22–87) | 62.0

CF compression fracture, BM bone marrow

Data synthesis and analysis

The patients' demographic characteristics and extracted covariates were summarized using standard descriptive statistics. When the information was available, studies were stratified according to those that included vertebral CFs and those that did not. Continuous variables were expressed as means and 95% CIs, and categorical variables were expressed as frequencies or percentages, unless stated otherwise. We used a bivariate random-effects model for the analysis and pooling of the diagnostic performance (sensitivity, specificity) measures across studies. To derive summary estimates of the diagnostic test performance, we plotted estimates of the observed sensitivities and specificities for each test in forest plots and hierarchical summary receiver operating characteristic (HSROC) curves derived from the individual study
results [16–18]. These results were plotted using HSROC curves with 95% confidence and prediction regions.

Table 2 Characteristics of the studies

Author | Year | Locale | Study period | Study design | Reference standard | Details of the BVC
Wonglaksanapimon et al. [47] | 2012 | Thailand | 2009.1–2010.3 | Prospective, consecutive | BVC or histopathology | Clinical, radiological findings
Taşkin et al. [48] | 2013 | Turkey | 2009.11–2011.2 | Prospective, consecutive | BVC or histopathology | Bone scintigraphy, conventional MR sequences, clinical follow-up, laboratory findings
Tadros et al. [49] | 2016 | Egypt | 2013.6–2015.1 | Prospective, consecutive | BVC or histopathology | Follow-up clinical, radiological findings
Sung et al. [50] | 2014 | South Korea | 2011.3–2012.12 | Retrospective, consecutive | BVC or histopathology | Follow-up MR imaging
Pui et al. [51] | 2005 | Canada | NR | Prospective, consecutive | BVC or histopathology | Microbiology, clinical response to antimicrobial therapy
Pozzi et al. [52] | 2017 | Italy | 2010.1–2016.11 | Retrospective, consecutive | Histopathology |
Park S et al. [53] | 2017 | South Korea | 2015.4–2016.1 | Retrospective, consecutive | BVC or histopathology | PET/CT, follow-up MR examination
Park HJ et al. [54] | 2016 | South Korea | 2011.8–2012.7 | Retrospective, consecutive | BVC or histopathology | Follow-up MR examinations
Geith et al. [55] | 2014 | Germany | NR | Prospective, consecutive | BVC or histopathology | PET/CT, follow-up MR/CT examinations
Fawzy et al. [56] | 2013 | Egypt | 2011.9–2012.7 | Retrospective, consecutive | Histopathology |
Abowarda et al. [57] | 2017 | Egypt | 2014.7–2016.2 | Retrospective, consecutive | BVC or histopathology | Follow-up MR examination or clinical follow-up
Abo Dewan et al. [58] | 2015 | Egypt | 2013.7–2014.9 | Retrospective, consecutive | BVC or histopathology | Clinical, radiological, laboratory findings

Remaining columns of Table 2: Time from MR to reference standard (days), Blinding, No. of readers, Reader experience (years). The assignment of these cells to individual studies is not recoverable from the source layout; the values, in extracted order, are: NR NR; NR NR; NR; NR; 2, consensus; 1; 1; Blinding 2, consensus; NR; 4; NR; 12, 8; Blinding 2, independent 12, 9; NR NR; Blinding 2, independent 12, 7; Blinding 2, independent 10, 5; NR; NR; 16; NR; NR; NR; NR; 2, consensus; Blinding 3, independent 15, 4, 1; NR; NR; NR; NR; NR; NR; NR; NR; Blinding 2, consensus; NR.

BVC best value comparator, MR magnetic resonance, NR not reported, PET positron emission tomography, CT computed tomography

Table 3 Magnetic resonance imaging characteristics

Columns: Author; Scanner (Vendor, Model); Technical parameters (Magnet strength (T), Conventional MR sequences, No. of image planes, DWI sequence, b values used for ADC (s/mm²), Minimum slice thickness (mm)); Interpretation (Determination of the ROI placement, ROI size (mm), ADC cut-off value (× 10⁻³ mm²/s)).

Authors: the 12 included studies [47–58], as listed in Table 1. The assignment of cell values to individual studies is not recoverable from the source layout; the values are therefore listed by column.

Vendor (model): Siemens (Avanto; Magnetom Avanto; Magnetom Verio; Magnetom Symphony Quantum); Philips (Achieva, four entries; Intera, Achieva; NR); GE (Signa; Signa Echospeed)
Magnet strength (T): 1.5 (eight studies); 3.0 (three studies); 3.0, 1.5 (one study)
Conventional MR sequences: T1WI, T2WI (two studies); T1WI, T2WI, STIR (two studies); T1WI, T2WI, T1CE (two studies); T1WI, T2WI, T1CE with FS (two studies); T1WI, T2WI, STIR, T1CE (one study); T1WI, T2WI, STIR, T1CE with FS (two studies); T1WI, T2WI, STIR, T1CE with FS, chemical shift (one study)
No. of image planes: 2 (nine studies); 1 (three studies)
DWI sequence: Single shot SE EPI (nine studies); Multi-shot SE EPI (one study); Spin echo EPI (one study); Single shot TSE (one study)
b values used for ADC (s/mm²): 800; 500; 400; 1000; 1000; 500, 800; 100, 250, 400; 0, 400, 1000; 0, 800; 0, 800, 1400; 1000; 0, 1000
Minimum slice thickness (mm): 2; 3; 3; 4; 5 (six studies); 6; 7
Determination of the ROI placement: High SI on DWI, low value on ADC map; High SI on DWI; High SI on STIR, low SI on T1WI; High SI on the T2WI, low SI on T1WI, and enhancement on T1CE; B0 images and T1WI; High SI on the DWI with a high b value; The brightest SI on the DWI (two studies); High SI on the DWI; Centre of abnormal low value on ADC map; NR (two studies)
ROI size (mm): 5–15; 4–10.4; 10.5–51.7; 0.19–2.60; NR (eight studies)
ADC cut-off value (× 10⁻³ mm²/s): 0.89; 0.952; 1.21; 1.48; 1.15; 1.7; 1.14; 0.695; 1.02; 1.031; 0.67; 1.32

DWI diffusion weighted imaging, ADC apparent diffusion coefficient, ROI region of interest, T1WI T1-weighted image, T2WI T2-weighted image, STIR short tau inversion recovery, T1CE contrast-enhanced T1-weighted image, FS fat suppression, SE spin echo, EPI echo planar imaging, TSE turbo spin echo, SI signal intensity, NR not reported

Fig. 2 Grouped bar charts showing the risk of bias (left) and concern for applicability (right) of the 12 included studies, using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 domains
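The per-study sensitivities and specificities shown in the forest plots come directly from each study's 2 × 2 table. The sketch below computes both proportions with 95% intervals; the Wilson score interval is used here as an assumption, because the text does not state which interval method the forest plots use, and the counts are hypothetical.

```python
# Sketch: per-study sensitivity and specificity with 95% Wilson score intervals.
# The interval method is an assumption (the source does not specify one),
# and the example counts are hypothetical.
from math import sqrt

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

def study_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity with their intervals from one 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }

result = study_accuracy(tp=45, fp=10, fn=5, tn=40)
print(result)
```

These single-study estimates are only the inputs to the pooled analysis; the bivariate random-effects model described in the text is not reproduced here.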

Heterogeneity was determined using Cochran's Q test (p < 0.05 indicating the presence of heterogeneity) and the I² test (0–40%, heterogeneity might not be important; 30–60%, moderate heterogeneity may be present; 50–90%, substantial heterogeneity may be present; and 75–100%, considerable heterogeneity [19]). In addition, the Spearman correlation coefficient between the sensitivity and false positive rate was calculated to assess the presence of a threshold effect. A Spearman correlation coefficient of greater than 0.6 was considered to indicate a considerable threshold effect [20]. To check for publication bias, we created Deeks' funnel plots of individual study log odds ratios plotted against the sample size [21].

Fig. 3 Coupled forest plots of pooled sensitivity and specificity for differentiating vertebral bone marrow lesions. Dots in squares represent sensitivity and specificity. Horizontal lines represent 95% confidence interval (CI) for each included study. The combined estimate, labelled "Summary", is based on a random-effects model and is indicated by diamonds. Corresponding heterogeneities (I²) with 95% CIs are provided in the bottom right corners [I² = 100% × (Q − df)/Q, where Q is Cochran's heterogeneity statistic and df is the degrees of freedom]
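The I² formula given in the figure captions, I² = 100% × (Q − df)/Q, and the coordinates used in a Deeks' funnel plot can be sketched as follows. Two points are assumptions rather than statements from the text: I² is truncated at zero when Q is smaller than df, and the horizontal axis uses 1/√(effective sample size), which is how Deeks' method is usually formulated, whereas the text only says "sample size". Function names and example values are hypothetical.

```python
# Sketch: I-squared from Cochran's Q, and one study's point on a Deeks' funnel plot.
# Assumptions: I-squared is floored at 0; the funnel plot x-axis is
# 1/sqrt(effective sample size); a 0.5 continuity correction is applied
# when any cell is zero. Example values are hypothetical.
from math import log, sqrt

def i_squared(q, df):
    """I-squared (%) = 100 * (Q - df) / Q, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)

def deeks_point(tp, fp, fn, tn):
    """Return (ln diagnostic odds ratio, 1/sqrt(effective sample size))."""
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))
    ln_dor = log((tp * tn) / (fp * fn))
    n_diseased = tp + fn
    n_non_diseased = fp + tn
    ess = 4 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)
    return ln_dor, 1 / sqrt(ess)

print(i_squared(q=55.0, df=11))
print(deeks_point(tp=45, fp=10, fn=5, tn=40))
```

In Deeks' approach the asymmetry test is a regression of the log diagnostic odds ratio on this x-coordinate; the regression itself is omitted from the sketch.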

Meta-regression

Meta-regression analyses were performed to explore the cause of heterogeneity using the following covariates: (1) study design (prospective vs. retrospective); (2) locale (Egypt vs. other countries); (3) percentage of malignancy (≥ 50% vs. < 50%); (4) mean age (≥ 59 years vs. < 59 years); (5) magnet field strength (used only 3 T vs. used only 1.5 T); (6) number of image planes (two image planes vs. one image plane); (7) DWI sequence (DW-echo planar imaging vs. DW-turbo spin echo); (8) minimum DWI thickness (≥ 5 mm vs. < 5 mm); (9) highest b value for ADC measurement (1000 vs. < 1000); (10) cut-off value (≥ 1.1 vs. < 1.1); (11) blinding (blinding vs. not reported); and (12) QUADAS-2 score (≥ 4 vs. < 4). In addition, sensitivity analyses were performed for the various settings stratified according to the aforementioned covariates. All statistical analyses were performed by one of the authors (C.H.S., with 4 years of experience in performing systematic reviews and meta-analyses). The "midas" and "metandi" modules in Stata 10.0 (StataCorp LP, College Station, TX, USA) and the "mada" package in R software version 3.4.1 (R Foundation for Statistical Computing, Vienna, Austria) were used for statistical analyses, with p < 0.05 indicating statistical significance.

Results

Literature search

A flow diagram summarizing the literature search is presented in Fig. 1. A total of 166 studies were identified during the initial search. After removing 66 duplicates, we reviewed the remaining 100 titles and abstracts, and 56 studies of the following types were excluded: review articles/guidelines/consensus statements (n = 21), case reports/letters/editorials/abstracts (n = 18) and those not in the field of interest (n = 17). After assessing the full texts of the 44 potentially eligible articles, we excluded 32 studies for the following reasons: insufficient data to reconstruct 2 × 2 tables despite quantitative assessment of ADC (n = 19) [5, 6, 9–11, 22–35], no quantitative assessment of ADC (n = 9) [1, 8, 36–42] and overlapping study populations (n = 4) [43–46]. Ultimately, 12 studies [47–58] that evaluated the diagnostic performance of DWI for differentiating between benign and malignant vertebral BMLs, comprising 748 lesions in 661 patients, were included in this meta-analysis.

Characteristics of the studies and included patients

The characteristics of the study patients are summarized in Table 1. The number of patients ranged from 22 to 116, and the number of lesions ranged from 30 to 133. The mean age ranged from 34.1 to 68.7 years, the percentage of male patients ranged from 36.4% to 63.3%, and the percentage of malignant vertebral BMLs ranged from 18.0% to 86.3%.

The study characteristics are summarized in Table 2. The study design was prospective in five studies [47–49, 51, 55] and retrospective in seven [50, 52–54, 56–58]. All studies were single-centred and recruited patients consecutively. Ten studies [47–51, 53–55, 57, 58] used either histopathology or BVC as the reference standard, whereas two [52, 56] used only histopathology.

The MRI characteristics are summarized in Table 3. Three studies [47, 50, 53] used only 3-T scanners, eight [48, 49, 51, 52, 55–58] used only 1.5-T scanners and one [54] used either 1.5-T or 3-T scanners. Regarding the conventional MRI sequences, six studies [47–49, 52, 55, 56] used short tau inversion recovery and eight [47–50, 52, 53, 57, 58] used contrast-enhanced T1-weighted imaging (T1WI) with or without fat suppression as an adjunct to T1WI and T2-weighted imaging (T2WI); two studies [51, 54] used only T1WI and T2WI. Nine studies [47–51, 53, 56–58] used two image planes and three [52, 54, 55] used only one (axial or sagittal). Most studies (n = 11) [47–54, 56–58] used the echo planar imaging sequence for DWI and only one [55] used the turbo spin echo sequence.

Figure 2 shows the risk of bias and applicability concerns for the 12 included studies. Overall, no studies were considered to be seriously flawed according to the QUADAS-2 tool.

Diagnostic performance for differentiating vertebral BMLs

The sensitivity and specificity of the 12 included studies ranged from 0.66 to 1.00 and 0.55 to 1.00, respectively. When all studies were combined, the pooled sensitivity was 0.89 (95% CI 0.80–0.94) with a specificity of 0.87 (95% CI 0.78–0.93). The Q test revealed no heterogeneity (Q = 2.777, p = 0.13). Substantial heterogeneity was shown in the sensitivity (I² = 80.20%) and specificity (I² = 92.59%). A threshold effect was shown not only by visual analysis of the coupled forest plot of sensitivity and specificity (Fig. 3) but also by a corresponding correlation coefficient of −0.518 (lower range of 95% CI, −0.842; upper range of 95% CI, 0.079) between sensitivity and false positive rate. The area under the HSROC curve was 0.94 (95% CI 0.92–0.96; Fig. 4). The Deeks' funnel plot and asymmetry test (p = 0.98 for the slope coefficient) both indicated a low likelihood of publication bias in our meta-analysis (Supplemental Fig. S1).

Fig. 4 Hierarchical summary receiver operating characteristic (HSROC) curve of the diagnostic performance of apparent diffusion coefficient for differentiating between benign and malignant vertebral bone marrow lesions. The summary point (red box) indicates that summary sensitivity was 0.89 (95% confidence interval [CI] 0.80–0.94) and the summary specificity was 0.87 (95% CI 0.78–0.93). The 95% confidence region represents 95% CIs of summary sensitivity and specificity, and 95% prediction region represents 95% CI of sensitivity and specificity of each individual study included in analysis. Study estimates indicate sensitivity and specificity estimated using data from each study separately. The size of the marker is scaled according to the total number of patients in each study. There was a large difference between the 95% confidence and the 95% prediction regions, indicating the possibility of the presence of heterogeneity between the studies

Diagnostic performance for differentiating vertebral CFs

We identified eight studies [45, 46, 48, 52–56] that performed quantitative analysis of DWI for differentiating benign and malignant CFs. The pooled sensitivity was 0.92 (95% CI 0.82–0.97) with a specificity of 0.91 (95% CI 0.87–0.94). The Q test revealed no heterogeneity (Q = 3.651, p = 0.08). The Higgins I² statistics revealed that heterogeneity is not important in terms of specificity (I² = 37.73%); however, substantial heterogeneity was shown in terms of sensitivity (I² = 79.22%). A threshold effect was shown not only by visual analysis of the coupled forest plot of sensitivity and specificity (Fig. 5) but also by a corresponding correlation coefficient of −0.297 (lower range of 95% CI, −0.828; upper range of 95% CI, 0.516) between sensitivity and false positive rate. The area under the HSROC curve was 0.95 (95% CI 0.93–0.97; Fig. 6). The Deeks' funnel plot and asymmetry test (p = 0.53 for the slope coefficient) both indicated a low likelihood of publication bias in our meta-analysis (Supplemental Fig. S2).

Fig. 5 Coupled forest plots of pooled sensitivity and specificity for differentiating vertebral compression fractures. Dots in squares represent sensitivity and specificity. Horizontal lines represent 95% confidence interval (CI) for each included study. The combined estimate, labelled "Summary", is based on a random-effects model and is indicated by diamonds. Corresponding heterogeneities (I²) with 95% CIs are provided in the bottom right corners [I² = 100% × (Q − df)/Q, where Q is Cochran's heterogeneity statistic and df is the degrees of freedom]
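The threshold-effect check reported in these sections rests on a Spearman correlation between per-study sensitivity and false positive rate (1 − specificity). A minimal, dependency-free sketch of that calculation follows; the function names and the per-study values are hypothetical, and the confidence interval for the coefficient is not computed.

```python
# Sketch: Spearman rank correlation between sensitivity and false positive rate.
# Pure-Python implementation with average ranks for ties.
# The per-study values below are hypothetical, not taken from the included studies.
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

sensitivities = [0.95, 0.90, 0.85, 0.80]
specificities = [0.70, 0.90, 0.80, 0.95]
false_positive_rates = [1 - s for s in specificities]

rho = spearman_rho(sensitivities, false_positive_rates)
# Criterion stated in the Methods: a coefficient greater than 0.6
# is taken to indicate a considerable threshold effect.
print(rho, rho > 0.6)
```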

Fig. 6 Hierarchical summary receiver operating characteristic (HSROC) curve of the diagnostic performance of apparent diffusion coefficient for differentiating between benign and malignant vertebral compression fractures. The summary point (red box) indicates that summary sensitivity was 0.92 (95% confidence interval [CI], 0.82–0.97) and the summary specificity was 0.91 (95% CI 0.87–0.94). The 95% confidence region represents 95% CIs of summary sensitivity and specificity, and 95% prediction region represents 95% CI of sensitivity and specificity of each individual study included in analysis. Study estimates indicate sensitivity and specificity estimated using data from each study separately. The size of the marker is scaled according to total number of patients in each study. There was a relatively small difference between the 95% confidence and the 95% prediction regions, indicating the low possibility of the presence of heterogeneity between the studies

Meta-regression analyses

The results of the meta-regression analyses are summarized in Table 4. Among the variables that were considered a potential source of heterogeneity, the locale (Egypt vs. other countries; p = 0.04) and slice thickness of DWI (≥ 5 mm vs. < 5 mm; p < 0.01) were significant factors. Specifically, studies from Egypt (n = 4) reported a significantly higher sensitivity (0.93 [95% CI 0.86–1.0]) compared with those from other countries (0.86 [95% CI 0.77–0.94]); however, the pooled specificity estimates were not significantly different (0.89 [95% CI 0.77–1.0] vs. 0.87 [95% CI 0.78–0.96]; p = 0.32). Regarding the slice thickness of DWI, studies using a thickness < 5 mm showed a higher specificity (0.95 [95% CI 0.90–1.0]) compared with those using a thickness ≥ 5 mm (0.81 [95% CI 0.71–0.92]); however, the pooled sensitivity estimates were not significantly different (0.87 [95% CI 0.75–1.0] vs. 0.88 [95% CI 0.80–0.96]; p = 0.32). Other factors, including the study design, percentage of malignancy, mean age, magnet strength, image plane, DWI sequence, highest b value used for ADC measurement, cut-off value, blinding and QUADAS-2 score, did not significantly affect heterogeneity.

Table 4 Results of meta-regression analyses

Covariate | No. of studies | Sensitivity (95% CI) | Specificity (95% CI)
Study design: Prospective | 5 | 0.88 (0.78–0.99) | [...]
Study design: Retrospective | 7 | 0.89 (0.80–0.97) | [...]
Locale: Country other than Egypt | 8 | 0.86 (0.77–0.94) | 0.87 (0.78–0.96)
Locale: Egypt | 4 | 0.93 (0.86–1.00) | 0.89 (0.77–1.00)
Percentage of malignancy: ≥ 50% | 5 | 0.85 (0.74–0.97) | [...]
Percentage of malignancy: < 50% | 7 | 0.91 (0.83–0.98) | [...]
Mean age (years): ≥ 59 | 6 | 0.87 (0.76–0.97) | [...]
Mean age (years): < 59 | 6 | 0.90 (0.82–0.98) | [...]
Magnet strength: Used only 3 T | 3 | 0.92 (0.81–1.00) | 0.95 (0.88–1.00)
Magnet strength: Used only 1.5 T | 8 | 0.89 (0.81–0.96) | 0.83 (0.74–0.93)
No. of image planes: Two planes | 9 | 0.91 (0.85–0.97) | [...]
No. of image planes: One plane | 3 | 0.76 (0.58–0.95) | [...]

Rows present in the source without recoverable values: DWI sequence (Turbo spin echo, Echo planar imaging); Minimum slice thickness of DWI (≥ 5 mm, < 5 mm); High b value for ADC measurement (≥ 1000, < 1000); Cut-off value (≥ 1.1, < 1.1); Blinding (Blinding, Not reported); QUADAS-2 score (≥ 4, < 4).
Values present in the source that cannot be assigned to a row: p values 0.28, 0.18, 0.07, 0.12, 0.13, 0.20, 0.79, 0.25 (the p value of 0.32 accompanies the specificity estimates for locale); specificity estimates 0.89 (0.80–0.97); 0.84 (0.71–0.97) and 0.89 (0.81–0.97); 0.86 (0.75–0.97) and 0.89 (0.79–0.98).

Discussion

Our meta-analysis showed that the diagnostic performance of the quantitative assessment of ADC was excellent for differentiating between both benign and malignant BMLs (sensitivity 89%, specificity 87%) and CFs (sensitivity 92%, specificity 91%). Our results have crucial clinical implications because they indicate that the ADC value can be used to characterize tissues. Previous studies [5–7] reported that the ADC value quantifies water proton motion, which is a combination of true water diffusion and capillary perfusion in biological tissues. Thus, it is used to differentiate between malignant and benign tissues as a result of the different degree of diffusion, which is related to the cellular density and extracellular space volume [5–7]. Malignant tumours are characterized by low ADC values as a result of high cell density and low extracellular [...]

[...] 0.05), caution is needed when applying ADC values, which depend on various technical factors [61, 62]. Therefore, a single value for accuracy cannot be recommended on the basis of this meta-analysis.

In our meta-analysis, the diagnostic performance of DWI for differentiating between benignity and malignancy was based on the mean ADC value only. However, few studies evaluated not only the sole diagnostic accuracy of the quantitative assessment of ADC but also the added value of the quantitative assessment of ADC for differentiating between benign and malignant CFs. Only two studies reported increased accuracy when combining quantitative assessment of ADC with conventional MRI feature analysis based on T1WI and T2WI, and/or contrast-enhanced sequences [48, 52]. In general, conventional MRI sequences with DWI (including ADC map) are used for patients with suspected CFs. Thus, a comprehensive and more advanced methodology (diagnostic accuracy of the quantitative assessment of ADC vs. added value of the quantitative assessment of ADC) may be needed for the ADC value to be applied in routine clinical practice.

One of the major limitations of this meta-analysis was the relatively small number of included studies. Many studies [5, 6, 9–11, 22–35] were excluded because they did not assess the diagnostic test accuracy, and thus did not calculate the sensitivity and specificity. However, those studies demonstrated that the mean ADC values differed significantly between benign and malignant vertebral lesions, and are thus in agreement with our meta-analysis results. Also, we used the summary line method of the bivariate random-effects model because of the small number of studies with different cut-off values instead of the summary point method [18]. Second, methodological differences were observed in the included studies. The included studies were heterogeneous in their design (prospective vs. retrospective), eligibility of the included patients and decision criteria. Although the statistical analysis of heterogeneity in effect sizes indicated homogeneity among the studies, the methodological diversity may have contributed to the misinterpretation of the pooled estimates.
In addition, we performed only univariate meta-regression analyses to explore the cause of heterogeneity, without adjusting for multiplicity. Because of the multiplicity of factors affecting the diagnostic accuracy of ADC, we could not identify precise technical parameters and measurements in our meta-analysis. Nevertheless, on the basis of the limited data available, we propose that quantitative assessment of ADC is useful in differentiating benign and malignant vertebral BMLs. Third, although we created Deeks' funnel plots and found a low likelihood of publication bias (p > 0.05), the enrolled studies were relatively small and many studies (n = 9) showed positive results. Indeed, studies reporting positive results are more likely to be published than those with neutral or negative results, which makes it impossible to estimate the magnitude of publication bias. However, since the direction of such a bias would normally be in favour of the ADC measurement, whether or not it occurred would not undermine the results presented here. Fourth, four of the 12 included articles (33%) were from Egypt; these were not easily obtainable (found only in EMBASE), and most items were not reported in these studies, which may have affected their assessed quality (low QUADAS-2 score). In the meta-regression analysis, when the four articles from Egypt were excluded, the pooled estimates were still excellent (sensitivity 86%, specificity 87%). Additionally, the pooled estimates (sensitivity 91%, specificity 90%) of studies with QUADAS-2 score < 4 were not significantly different from those with QUADAS-2 score ≥ 4. Thus, the locale and study quality did not undermine the reliability of our results. Finally, the diffusion characteristics of benign vertebral fractures, such as those caused by osteoporosis, trauma and infection, were not fully investigated separately. Additional studies with a larger sample size would certainly be useful to determine the ADC cut-off values, sensitivity, specificity, and positive and negative predictive values of ADC in this aspect.

In conclusion, the quantitative assessment of ADC demonstrated excellent diagnostic performance for differentiating between benign and malignant vertebral BMLs and CFs. Quantitative assessment of the ADC using a thinner (< 5 mm) slice thickness is recommended for more accurate measurement of the mean ADC value.

Funding The authors state that this work has not received any funding.

Compliance with ethical standards

Guarantor The scientific guarantor of this publication is Seong Jong Yun, MD.

Conflict of interest The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

Statistics and biometry One of the authors (Chong Hyun Suh, MD) has significant statistical expertise.

Informed consent Written informed consent was not required for this study because it was a systematic review and meta-analysis.

Ethical approval Institutional review board approval was not required because this study was a systematic review and meta-analysis.

Methodology
• Meta-analysis performed at one institution

References

1. Oztekin O, Ozan E, Hilal Adibelli Z, Unal G, Abali Y (2009) SSH-EPI diffusion-weighted MR imaging of the spine with low b values: is it useful in differentiating malignant metastatic tumor infiltration from benign fracture edema? Skeletal Radiol 38:651–658
2. Kim YP, Kannengiesser S, Paek MY et al (2014) Differentiation between focal malignant marrow-replacing lesions and benign red marrow deposition of the spine with T2*-corrected fat-signal fraction map using a three-echo volume interpolated breath-hold gradient echo Dixon sequence. Korean J Radiol 15:781–791
3. Jung HS, Jee WH, McCauley TR, Ha KY, Choi KH (2003) Discrimination of metastatic from acute osteoporotic compression spinal fractures with MR imaging. RadioGraphics 23:179–187
4. Shih TT, Huang KM, Li YW (1999) Solitary vertebral collapse: distinction between benign and malignant causes using MR patterns. J Magn Reson Imaging 9:635–642
5. Herneth AM, Philipp MO, Naude J et al (2002) Vertebral metastases: assessment with apparent diffusion coefficient. Radiology 225:889–894
6. Baur A, Stäbler A, Brüning R et al (1998) Diffusion-weighted MR imaging of bone marrow: differentiation of benign versus pathologic compression fractures. Radiology 207:349–356
7. Castillo M (2003) Diffusion-weighted imaging of the spine: is it reliable? AJNR Am J Neuroradiol 24:1251–1253
8. Castillo M, Arbelaez A, Smith JK, Fisher LL (2000) Diffusion-weighted MR imaging offers no advantage over routine noncontrast MR imaging in the detection of vertebral metastases. AJNR Am J Neuroradiol 21:948–953
9. Zhou XJ, Leeds NE, McKinnon GC, Kumar AJ (2002) Characterization of benign and metastatic vertebral compression fractures with quantitative diffusion MR imaging. AJNR Am J Neuroradiol 23:165–170
10. Chan JH, Pen WC, Tsui EY et al (2002) Acute vertebral body compression fractures: discrimination between benign and malignant causes using apparent diffusion coefficients. Br J Radiol 75:207–214
11. Tang G, Liu Y, Li W, Yao J, Li B, Li P (2007) Optimization of b value in diffusion-weighted MRI for the differential diagnosis of benign and malignant vertebral fractures. Skeletal Radiol 36:1035–1041
12. Luo Z, Litao L, Gu S et al (2016) Standard-b-value vs low-b-value DWI for differentiation of benign and malignant vertebral fractures: a meta-analysis. Br J Radiol 89:20150384
13. Thawait SK, Marcus MA, Morrison WB, Klufas RA, Eng J, Carrino JA (2012) Research synthesis: what is the diagnostic performance of magnetic resonance imaging to discriminate benign from malignant vertebral compression fractures? Systematic review and meta-analysis. Spine (Phila Pa 1976) 37:E736–E744
14. Liberati A, Altman DG, Tetzlaff J et al (2009) The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med 6:e1000100
15. Whiting PF, Rutjes AW, Westwood ME et al (2011) QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 155:529–536
16. Suh CH, Park SH (2016) Successful publication of systematic review and meta-analysis of studies evaluating diagnostic test accuracy. Korean J Radiol 17:5–6
17. Kim KW, Lee J, Choi SH, Huh J, Park SH (2015) Systematic review and meta-analysis of studies evaluating diagnostic test accuracy: a practical review for clinical researchers-part I. General guidance and tips. Korean J Radiol 16:1175–1187
18. Lee J, Kim KW, Choi SH, Huh J, Park SH (2015) Systematic review and meta-analysis of studies evaluating diagnostic test accuracy: a practical review for clinical researchers-part II. Statistical methods of meta-analysis. Korean J Radiol 16:1188–1196
19. Higgins J, Green S. Cochrane handbook for systematic reviews of interventions. Version 5.1.0. The Cochrane Collaboration. http://handbook.cochrane.org/chapter_9/9_5_2_identifying_and_measuring_heterogeneity.htm. Accessed 30 Sept 2017
20. Deville WL, Buntinx F, Bouter LM et al (2002) Conducting systematic reviews of diagnostic studies: didactic guidelines. BMC Med Res Methodol 2:9
21. Deeks JJ, Macaskill P, Irwig L (2005) The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol 58:882–893
22. Balliu E, Vilanova JC, Peláez I et al (2009) Diagnostic value of apparent diffusion coefficients to differentiate benign from malignant vertebral bone marrow lesions. Eur J Radiol 69:560–566
23. Bammer R, Herneth AM, Maier SE et al (2003) Line scan diffusion imaging of the spine. AJNR Am J Neuroradiol 24:5–12
24. Baur A, Huber A, Ertl-Wagner B et al (2001) Diagnostic value of increased diffusion weighting of a steady-state free precession sequence for differentiating acute benign osteoporotic fractures from pathologic vertebral compression fractures. AJNR Am J Neuroradiol 22:366–672
25. Bhugaloo A, Abdullah B, Siow Y, Ng K (2006) Diffusion weighted MR imaging in acute vertebral compression fractures: differentiation between malignant and benign causes. Biomed Imaging Interv J 2:e12
26. Biffar A, Sourbron S, Dietrich O et al (2010) Combined diffusion-weighted and dynamic contrast-enhanced imaging of patients with acute osteoporotic vertebral fractures. Eur J Radiol 76:298–303
27. Geith T, Biffar A, Schmidt G et al (2015) Physiological background of differences in quantitative diffusion-weighted magnetic resonance imaging between acute malignant and benign vertebral body fractures: correlation of apparent diffusion coefficient with quantitative perfusion magnetic resonance imaging using the 2-compartment exchange model. J Comput Assist Tomogr 39:643–648
28. Lin F, Lei Y, Li YB (2009) Influence of lesion ratio on diagnostic performance of in-phase/opposed-phase imaging and apparent diffusion coefficient for differentiating acute benign vertebral fractures and metastases. Chin Med J (Engl) 122:1293–1299
29. Maeda M, Sakuma H, Maier SE, Takeda K (2003) Quantitative assessment of diffusion abnormalities in benign and malignant vertebral compression fractures by line scan diffusion-weighted imaging. AJR Am J Roentgenol 181:1203–1209
30. Martel Villagrán J, Bueno Horcajadas Á, Pérez Fernández E, Martín Martín S (2015) Accuracy of magnetic resonance imaging in differentiating between benign and malignant vertebral lesions: role of diffusion-weighted imaging, in-phase/opposed-phase imaging and apparent diffusion coefficient. Radiologia 57:142–149
31. Pozzi G, Garcia Parra C, Stradiotti P, Tien TV, Luzzati A, Zerbi A (2012) Diffusion-weighted MR imaging in differentiation between osteoporotic and neoplastic vertebral fractures. Eur Spine J 21:S123–S127
32. Rumpel H, Chan LL, Chan LP, Png MA, Tan RK, Lim WE (2006) Vertebrae adjacent to spinal bone lesion are inconsistent reference markers: a magnetic resonance spectroscopic viewpoint. J Magn Reson Imaging 23:574–577
33. Rumpel H, Chong Y, Porter DA, Chan LL (2013) Benign versus metastatic vertebral compression fractures: combined diffusion-weighted MRI and MR spectroscopy aids differentiation. Eur Radiol 23:541–550
34. Spuentrup E, Buecker A, Adam G, van Vaals JJ, Guenther RW (2001) Diffusion-weighted MR imaging for differentiation of benign fracture edema and tumor infiltration of the vertebral body. AJR Am J Roentgenol 176:351–358
35. Zidan DZ, Elghazaly HA (2014) Can unenhanced multiparametric MRI substitute gadolinium-enhanced MRI in the characterization of vertebral marrow infiltrative lesions? Egypt J Radiol Nucl Med 45:443–453
36. Del Vescovo R, Frauenfelder G, Giurazza F et al (2014) Role of whole-body diffusion-weighted MRI in detecting bone metastasis. Radiol Med 119:758–766
37. Hamimi A, Kassab F, Kazkaz G (2015) Osteoporotic or malignant vertebral fracture? This is the question. What can we do about it? Egypt J Radiol Nucl Med 46:97–103
38. Mubarak F, Akhtar W (2011) Acute vertebral compression fracture: differentiation of malignant and benign causes by diffusion weighted magnetic resonance imaging. J Pak Med Assoc 61:555–558
39. Osman OM, Fahmy YR, El-Oraby AM, El-Basmy AA, Amin YE (2007) Role of diffusion WIs and T2* GRE pulse sequences in dubious vertebral marrow pathological lesions. J Egypt Natl Canc Inst 19:254–262
40. Park SW, Lee JH, Ehara S et al (2004) Single shot fast spin echo diffusion-weighted MR imaging of the spine; Is it useful in differentiating malignant metastatic tumor infiltration from benign fracture edema? Clin Imaging 28:102–108
41. Tzeng YH, Chang TY, Huang GS, Lan GY, Hou WY, Shen HJ (2004) Diffusion-weighted MR imaging for differentiating acute benign from pathologic compression fractures: a reinvestigation of the usefulness of diffusion-weighted imaging. Chin J Radiol 29:109–115
42. Yao WW, Li MH, Yang SX, Zhu LL (2005) Use of diffusion-weighted magnetic resonance imaging to differentiate between acute benign and pathological vertebral fractures: prospective study. J HK Coll Radiol 8:4–8
43. Biffar A, Baur-Melnyk A, Schmidt GP, Reiser MF, Dietrich O (2011) Quantitative analysis of the diffusion-weighted steady-state free precession signal in vertebral bone marrow lesions. Invest Radiol 46:601–609
44. Biffar A, Baur-Melnyk A, Schmidt GP, Reiser MF, Dietrich O (2010) Multiparameter MRI assessment of normal-appearing and diseased vertebral bone marrow. Eur Radiol 20:2679–2689
45. Geith T, Schmidt G, Biffar A et al (2012) Comparison of qualitative and quantitative evaluation of diffusion-weighted MRI and chemical-shift imaging in the differentiation of benign and malignant vertebral body fractures. AJR Am J Roentgenol 199:1083–1092
46. Geneidi EASH, Ali HI, Dola EF (2016) Role of DWI in characterization of bone tumors. Egypt J Radiol Nucl Med 47:919–927
47. Wonglaksanapimon S, Chawalparit O, Khumpunnip S, Tritrakarn SO, Chiewvit P, Charnchaowanish P (2012) Vertebral body compression fracture: discriminating benign from malignant causes by diffusion-weighted MR imaging and apparent diffusion coefficient value. J Med Assoc Thai 95:81–87
48. Taskin G, Incesu L, Aslan K (2013) The value of apparent diffusion coefficient measurements in the differential diagnosis of vertebral bone marrow lesions. Turk J Med Sci 43:379–387
49. Tardos MY, Louka AL (2016) Discrimination between benign and malignant in vertebral marrow lesions with diffusion weighted MRI and chemical shift. Egypt J Radiol Nucl Med 47:557–569
50. Sung JK, Jee WH, Jung JY et al (2014) Differentiation of acute osteoporotic and malignant compression fractures of the spine: use of additive qualitative and quantitative axial diffusion-weighted MR imaging to conventional MR imaging at 3.0 T. Radiology 271:488–498
51. Pui MH, Mitha A, Rae WI, Corr P (2005) Diffusion-weighted magnetic resonance imaging of spinal infection and malignancy. J Neuroimaging 15:164–170
52. Pozzi G, Albano D, Messina C et al (2017) Solid bone tumors of the spine: diagnostic performance of apparent diffusion coefficient measured using diffusion-weighted MRI using histology as a reference standard. J Magn Reson Imaging. https://doi.org/10.1002/jmri.25826
53. Park S, Kwack KS, Chung NS, Hwang J, Lee HY, Kim JH (2017) Intravoxel incoherent motion diffusion-weighted magnetic resonance imaging of focal vertebral bone marrow lesions: initial experience of the differentiation of nodular hyperplastic hematopoietic bone marrow from malignant lesions. Skeletal Radiol 46:675–683
54. Park HJ, Lee SY, Rho MH et al (2016) Single-shot echo-planar diffusion-weighted MR imaging at 3T and 1.5T for differentiation of benign vertebral fracture edema and tumor infiltration. Korean J Radiol 17:590–597
55. Geith T, Schmidt G, Biffar A et al (2014) Quantitative evaluation of benign and malignant vertebral fractures with diffusion-weighted MRI: what is the optimum combination of b values for ADC-based lesion differentiation with the single-shot turbo spin-echo sequence? AJR Am J Roentgenol 203:582–588
56. Fawzy F, Tantawy HI, Ragheb A, Abo Hashem S (2013) Diagnostic value of apparent diffusion coefficient to differentiate benign from malignant vertebral bone marrow lesions. Egypt J Radiol Nucl Med 44:265–271
57. Abowarda MH, Abdel-Rahman HM, Taha MM (2017) Differentiation of acute osteoporotic from malignant vertebral compression fractures with conventional MRI and diffusion MR imaging. Egypt J Radiol Nucl Med 48:207–213
58. Abo Dewan KAW, Salama AA, El Habashy HMS, Khalil AES (2015) Evaluation of benign and malignant vertebral lesions with diffusion weighted magnetic resonance imaging and apparent diffusion coefficient measurements. Egypt J Radiol Nucl Med 46:423–433
59. Koh DW, Collins DJ (2007) Diffusion-weighted MRI in the body: applications and challenges in oncology. AJR Am J Roentgenol 188:1622–1635
60. Kwee TC, Takahara T, Ochiai R, Nievelstein RAJ, Luiten PR (2008) Diffusion-weighted whole-body imaging with background body signal suppression (DWIBS): features and potential applications in oncology. Eur Radiol 18:1937–1952
61. Donati OF, Chong D, Nanz D et al (2014) Diffusion weighted MR imaging of upper abdominal organs: field strength and intervendor variability of apparent diffusion coefficients. Radiology 270:454–463
62. Dale BM, Braithwaite AC, Boll DT, Merkle EM (2010) Field strength and diffusion encoding technique affect the apparent diffusion coefficient measurements in diffusion-weighted imaging of the abdomen. Invest Radiol 45:104–108
