Prediction of pathologic femoral fractures in patients with lung cancer ...

Journal of Orthopaedic Surgery

Article

Prediction of pathologic femoral fractures in patients with lung cancer using machine learning algorithms: Comparison of computed tomography-based radiological features with clinical features versus without clinical features

Journal of Orthopaedic Surgery 25(2) 1–7 © The Author(s) 2017 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav DOI: 10.1177/2309499017716243 journals.sagepub.com/home/osj

Eunsun Oh1,2, Sung Wook Seo3, Young Cheol Yoon1, Dong Wook Kim3, Sunyoung Kwon4 and Sungroh Yoon4

Abstract

Purpose: The purpose of this article is to compare the predictive power of two models for pathologic femoral fractures in patients with lung cancer, one trained with computed tomography (CT)-based radiological features and one trained with both CT-based radiological and clinical features, using machine learning algorithms.

Methods: Between January 2010 and December 2014, 315 lung cancer patients with metastasis to the femur were included. Among them, 84 patients who underwent CT scan and were followed up for more than 3 months were enrolled. We examined clinical and radiological risk factors affecting pathologic fracture through logistic regression. Predictive analysis was performed using five different supervised learning algorithms. The power of the predictive model trained with CT-based radiological features was compared to that of the model trained with both CT-based radiological and clinical features.

Results: In multivariate logistic regression, female sex (odds ratio = 0.25, p = 0.0126), osteolysis (odds ratio = 7.62, p = 0.0239), and absence of radiation therapy (odds ratio = 10.25, p = 0.0258) were significantly associated with the risk of pathologic fracture in the proximal femur. The predictive model trained with both CT-based radiological and clinical features showed the highest area under the receiver operating characteristic curve (0.80 ± 0.14, p < 0.0001) through the gradient boosting algorithm.

Conclusion: We believe that machine learning algorithms may be useful in the prediction of pathologic femoral fracture, which is a multifactorial problem.

Keywords femoral metastasis, machine learning algorithm, pathologic fractures, predictive analytics

Introduction Pathologic femoral fractures due to lung cancer decrease the quality of life and increase mortality in patients with lung cancer. Prophylactic surgery for impending fracture improves the quality of life and minimizes the surgery-associated complications that often occur in patients who sustain a fracture and are subsequently bedridden.1,2 However, early surgery for bone metastasis in the femur is controversial. One reason is that most cases of bone metastasis can be controlled with conservative treatments such as radiation therapy and chemotherapy.3,4 Another

1 Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
2 Department of Radiology, Soonchunhyang University Seoul Hospital, Seoul, Korea
3 Department of Orthopedic Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
4 Department of Electrical and Computer Engineering and Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea

Corresponding author: Sung Wook Seo, Department of Orthopedic Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-Ro, Gangnam-gu, Seoul 06351, Republic of Korea. Email: [email protected]

Creative Commons CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-Non Commercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

reason is that the assessment methods for fracture risk are still inaccurate. The accuracy of the canonical Mirels' scoring system was reportedly less than 50%.5 Computed tomography (CT)-based structural rigidity analysis was recently developed and showed better performance; however, its accuracy was still as low as 63%.6,7 Previous studies enrolled patients with femoral metastasis derived from diverse cancers, which often follow different clinical courses as the disease progresses. More importantly, clinical features such as body weight, height, and response to chemotherapy or radiation therapy are associated with metastatic femoral fractures. Thus, image-based analysis alone may be limited in predicting fractures. Machine learning algorithms are computational learning methods that aid in decision-making by analyzing data. These algorithms have shown better performance in predictive models because they improve accuracy by combining many features that are individually informative but not statistically significant. We hypothesized that machine learning algorithms would improve the predictive accuracy of fracture in the metastatic femur by combining CT-based radiological features with clinical features. In this study, the predictive power of a model for pathological fracture trained with CT-based radiological features was compared to that of a model trained with both CT-based radiological features and clinical information.

Methods The institutional review board approved this retrospective Health Insurance Portability and Accountability Act (HIPAA)-compliant study and waived informed patient consent (2016-06-086).

Patients Between January 2010 and December 2014, 761 lung cancer patients were diagnosed with bone metastasis through pathological or clinical confirmation. Of these, 315 patients had bone metastasis to the proximal femur. Patients who underwent CT scan and were followed up for more than 3 months were included in this study. To avoid bias due to surgical decisions, we excluded patients who underwent surgery without imaging evidence of fracture, that is, surgery for impending fractures. Among the 315 patients, 154 underwent CT scan. Thirty-seven patients were excluded because they died within 3 months after CT scan, and 33 patients were excluded because the quality of their CT images was not appropriate for bone evaluation. Thus, 84 patients were ultimately enrolled in this study.

Outcome measure The study cohort was followed up for at least 3 months after CT scan. A pathological fracture was defined when the fracture occurred within 3 months after CT scan. We considered patients who underwent chemotherapy within 3 months before CT scan as the recent chemotherapy group and those who underwent radiation therapy for a metastatic lesion of the proximal femur as the radiation treatment group. Based on simple radiographs, a lesion was assigned to the osteolytic group when its radiolucency was increased compared to that on the contralateral side. The amount of cortical disruption and the character of pain were not evaluated to avoid inter-/intra-observer errors.

CT protocol and evaluation CT scans were performed with a GE LightSpeed RT 16 system (GE Healthcare, Milwaukee, Wisconsin, USA), from the lower lumbar to the distal femur level, with 120 kVp, tube current of 250 mA, and slice thickness of 5 mm. The density and length of the lesion, density of affected bone and contralateral bone, relative density of the affected bone, extent and severity of cortical disruption, and distance between the tumor center and body midline were measured and evaluated by a musculoskeletal radiologist using a picture archiving and communication system (Centricity Radiology RA 1000, GE Healthcare, Chicago, Illinois, USA). The density of the lesion was measured by placing a free-hand drawn region of interest in the lesion at the slice of the largest lesion. The length of the lesion was calculated by multiplying the slice thickness by the number of affected slices. The density of affected bone and contralateral bone was defined as the Hounsfield unit of the entire area of affected bone including both the cortical bone and bone marrow at the slice of the largest lesion and that of the same slice in the contralateral limb, respectively. The relative density of the affected bone was defined as the ratio of the density of affected bone to contralateral bone (Figure 1). The extent of cortical disruption was graded into four grades: grade 1, cortical disruption involving 75% of the entire perimeter. The severity of cortical disruption was classified as grade 1 when the outer cortex was intact or grade 2 when the full thickness of cortical bone was destroyed. The distance between the tumor center and body midline was the shortest distance between the tumor center and the line connecting the symphysis pubis and the sacral center.

Analysis pipeline

We extracted radiological features from CT scan and clinical features from the known independent variables that


Figure 1. Extraction of radiological features from CT scan. (a) Density of the lesion. (b) Density of the affected bone. (c) Density of the contralateral bone. (d) Distance between the tumor center and body midline. CT: computed tomography.

Table 1. CT-based radiological features and clinical features.

CT-based radiological features                           Clinical features
Length of the lesion (mm)                                Sex
Density of the lesion (HU)                               Age
Density of the affected bone (HU)                        Chemotherapy
Density of the contralateral bone (HU)                   Radiation therapy
Relative density of the affected bone                    Presence of osteolysis
Extent of cortical disruption                            Bodyweight (kg)
Severity of cortical disruption                          Height (cm)
Distance between the tumor center and body midline       Body mass index (kg/m2)

CT: computed tomography; HU: Hounsfield unit.

were expected to affect either tumor progression or mechanical loading, including findings on simple radiographs (Table 1). All CT-based radiological and clinical values were preprocessed using L2-normalization, and the outcomes were labeled as binary values. If a patient developed pathologic fracture within 3 months after CT scan, the outcome was labeled as 1. If a patient did not have fracture within 3 months after CT scan, the outcome was labeled as 0. We performed predictive analysis with the data of 84 patients and 16 different features. After the normalization of features, the data set was randomly split into an 85% training set and a 15% test set using N = 100 bootstrap resampling. The classifiers were optimized by tuning hyperparameters with five-fold cross-validation. The optimized classifiers were trained with the bootstrap training samples and tested with the corresponding bootstrap test samples. The process was performed iteratively across all 100 bootstrap-resampled data sets to evaluate the predictive power of each classifier (Figure 2).
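The normalization and resampling steps described above can be sketched in plain Python. This is only an illustration under stated assumptions: the text does not say whether L2-normalization was applied per feature or per patient (the sketch scales each feature column), and "bootstrap resampling" is interpreted here as 100 independent random 85%/15% partitions. The function names `l2_normalize_columns` and `repeated_splits` are hypothetical and do not come from the study.

```python
import math
import random


def l2_normalize_columns(rows):
    """Scale each feature column to unit Euclidean (L2) norm.

    Assumption: normalization is per feature; the paper does not specify this.
    """
    n_features = len(rows[0])
    norms = []
    for j in range(n_features):
        norm = math.sqrt(sum(row[j] ** 2 for row in rows))
        norms.append(norm if norm > 0 else 1.0)  # leave all-zero columns unchanged
    return [[row[j] / norms[j] for j in range(n_features)] for row in rows]


def repeated_splits(n_samples, n_rounds=100, test_fraction=0.15, seed=0):
    """Yield (train_indices, test_indices) for each resampling round."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_rounds):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # The first n_test shuffled indices form the test set, the rest the training set.
        yield indices[n_test:], indices[:n_test]


# 84 patients, as in the study; the split sizes follow from the 85%/15% ratio.
splits = list(repeated_splits(n_samples=84))
train_idx, test_idx = splits[0]
normalized = l2_normalize_columns([[3.0, 0.0], [4.0, 0.0]])
```

With 84 patients, each round in this sketch holds out 13 patients for testing and trains on the remaining 71.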

Machine learning classifier The classifiers used for the current study included AdaBoost, support vector machine (SVM), gradient boosting (GB), linear discriminant analysis (LDA), and random forest. Each classifier was optimized by tuning hyperparameters by means of five-fold cross-validation with the training data set. The optimized hyperparameters of each classifier are listed in Table 2. Python 2.7.3 (Python Software Foundation, Beaverton, Oregon, USA) with the scikit-learn, Matplotlib, SciPy, and NumPy packages was used for the study.8
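As a rough illustration of how the classifiers named above might be set up with scikit-learn, the sketch below instantiates them with the hyperparameter values reported in Table 2. It is not the authors' code. It targets a current scikit-learn release rather than the Python 2.7 environment used in the study, the AdaBoost `algorithm = SAMME.R` setting is left out because recent releases have deprecated that option, and the `param_grid` shown for the five-fold search is a hypothetical example.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hyperparameter values are taken from Table 2 of the article.
classifiers = {
    "AdaBoost": AdaBoostClassifier(n_estimators=280, learning_rate=0.08),
    "GB": GradientBoostingClassifier(n_estimators=200, max_depth=5, learning_rate=0.05),
    "LDA": LinearDiscriminantAnalysis(solver="svd"),
    "RF": RandomForestClassifier(n_estimators=200, max_depth=6, max_features=1),
    "SVC (R)": SVC(max_iter=100, kernel="rbf", C=0.1, probability=True),
    "SVC (L)": SVC(max_iter=100, kernel="linear", C=0.1, probability=True),
}

# Hypothetical search grid: the article reports only the selected values,
# not the candidate values that were searched.
param_grid = {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.05, 0.1]}
search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=200),
    param_grid,
    scoring="roc_auc",
    cv=5,  # five-fold cross-validation on the training set
)
```

Calling `search.fit(X_train, y_train)` on a training split would then select the combination with the best cross-validated AUC; `X_train` and `y_train` are placeholders for the normalized features and binary fracture labels.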

Statistical analysis To identify clinical risk factors for pathologic fracture, logistic regression analysis was performed. Clinical features were used as independent variables, and multivariate logistic regression was performed with MedCalc (version 12.7; MedCalc Software, Ostend, Belgium). We evaluated the performance of the predictive models by receiver operating characteristic (ROC) analysis, and we compared the area under the ROC curve (AUC) of models trained with the combination of CT-based radiological and clinical features with that of models trained with only CT-based radiological features. The AUCs of the classifiers were compared with one another using the Mann–Whitney test.9 To predict fracture, we analyzed the ROC curve of the best classifier and determined the cutoff value by maximizing accuracy. Based on the optimal cutoff value, we evaluated sensitivity and specificity. All variables with p < 0.05 were considered to be statistically significant.

Figure 2. Pipeline of data analysis. Prediction analysis was performed on the data of 84 patients and 16 different features. The data set was randomly split into an 85% training set and a 15% test set using N = 100 bootstrap resampling. The classifiers were optimized by tuning hyperparameters with five-fold cross-validation. The optimized classifiers were trained with the bootstrap training samples and tested with the corresponding bootstrap test samples. The process was performed iteratively across all 100 bootstrap-resampled data sets to evaluate the predictive power of each classifier.

Results Patient population The mean age of the patients in this study was 64.07 years. Thirty-two patients were female and 52 patients were male. Six patients had small cell lung cancer, and the other 78 patients had non-small cell lung cancer. All patients had bone metastasis in the proximal femur; 43 metastases were in the right femur, 28 were in the left femur, and 13 were bilateral. The average Karnofsky performance status score was 70 of 100; no patient scored under 30 or was unable to walk. Fifty-two patients had chemotherapy within 3 months. Of these, 30 patients had molecularly targeted therapy and 22 patients had treatment with conventional agents. Seventy-seven patients had external beam radiation therapy on the femoral lesion during the follow-up period (Table 3).

Clinical risk factors for fracture in the metastatic proximal femur: Descriptive analysis using multivariate logistic regression analysis To identify clinical risk factors for pathologic fracture in the proximal femur within 3 months, we performed logistic regression analysis. In multivariate logistic regression, female sex (odds ratio = 0.25, p = 0.0126), osteolysis (odds ratio = 7.62, p = 0.0239), and absence of radiation therapy (odds ratio = 10.25, p = 0.0258) were significantly associated with the risk of pathologic fracture in the proximal femur.

Predictive analysis with machine learning algorithm: Comparison of combination of CT-based radiological and clinical features versus only CT-based radiological features

The optimized classifiers were trained with the bootstrap training samples and tested with the corresponding bootstrap test samples. In the group of the predictive models trained with CT-based radiological features, LDA showed the highest AUC (0.62 ± 0.19). In the group of the predictive models trained with CT-based radiological and clinical features, GB showed the highest AUC (0.80 ± 0.14; Figure 3). The best predictive models of both groups were compared using the Mann–Whitney test. The GB predictive model trained with CT-based radiological and clinical features showed significantly higher accuracy (0.76 ± 0.13; p < 0.0001) and AUC (0.80 ± 0.14; p < 0.0001) compared to the accuracy (0.62 ± 0.03) and AUC (0.62 ± 0.19) of the LDA predictive model trained with CT-based radiological features (Figure 4). Based on the ROC curves of the GB predictive model, we found the cutoff value was 0.65. Based on the cutoff value, the specificity was 80% and the sensitivity was 65%.
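To make the AUC, cutoff, sensitivity, and specificity reported above concrete, the following sketch computes them from predicted scores in plain Python. It is an illustration under assumptions, not the study's procedure: the labels and scores are made-up toy values, the cutoff is chosen among observed scores by maximizing accuracy, and the helper names `auc_from_scores`, `confusion_at`, and `best_cutoff` are hypothetical.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_at(labels, scores, cutoff):
    """Return (sensitivity, specificity, accuracy) when score >= cutoff predicts fracture."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(labels)


def best_cutoff(labels, scores):
    """Choose the observed score that maximizes accuracy (lowest such score on ties)."""
    return max(sorted(set(scores)), key=lambda c: confusion_at(labels, scores, c)[2])


# Toy data for illustration only: 1 = fracture within 3 months, 0 = no fracture.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.5]

auc = auc_from_scores(labels, scores)
cutoff = best_cutoff(labels, scores)
sensitivity, specificity, accuracy = confusion_at(labels, scores, cutoff)
```

Maximizing accuracy is only one way to pick an operating point; it weights false positives and false negatives equally, which may not match the clinical cost of a missed fracture.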

Discussion Machine learning algorithms combine many features and construct an optimized mathematical equation to predict the target with the least error. They can suggest an optimal direction within numerous dimensions that are not apparent to humans. Thus, machine learning algorithms should be useful in determining an optimal approach to multifactorial events such as pathological fractures. In this study, we were able to combine CT-based radiological and clinical features to predict pathological femoral fractures using machine learning algorithms. We demonstrated that the predictive power of the model trained with only CT-based radiological features was similar to that of the previous CT-based structural rigidity analysis, which showed 63% predictive accuracy. However, we demonstrated that the other predictive model, trained with the combination of CT-based radiological and clinical features, showed superior predictive power for pathologic femoral fracture.


Table 2. Optimized supervised learning algorithms used for training.a

Supervised learning algorithm    Hyperparameters
AdaBoost                         n_estimators = 280, algorithm = SAMME.R, learning_rate = 0.08
Gradient boosting                n_estimators = 200, max_depth = 5, learning_rate = 0.05
Linear discriminant analysis     solver = singular value decomposition
Random forest                    n_estimators = 200, max_depth = 6, max_features = 1
SVC (R)                          SVC (max_iter = 100, kernel = RBF, C = 0.1, probability = true)
SVC (L)                          SVC (max_iter = 100, kernel = linear, C = 0.1, probability = true)

SVC: support vector machine.
a The algorithms were optimized by tuning hyperparameters with five-fold cross-validation.

Table 3. Demographic characteristics of the study population.

Characteristic                            Number    Percentage (%)
Lung cancer
  Small cell lung cancer                  6         7.1
  Non-small cell lung cancer              78        92.9
Metastasis location
  Right femur                             43        51.2
  Left femur                              28        33.3
  Both femurs                             13        15.5
Chemotherapy
  Chemotherapy within 3 months            52        61.9
  No chemotherapy within 3 months         32        38.1
Chemotherapy regimen
  Targeted therapy                        30        57.7
  Conventional therapy                    22        42.3
RT
  RT to the femur during follow-up        77        91.7
  No RT to the femur during follow-up     7         8.3
Fracture
  Fracture within 3 months                60        71.4
  No fracture within 3 months             24        28.6

RT: radiation therapy.

Figure 3. Predictive performances of the predictive models: combination of CT-based radiological and clinical features versus only CT-based radiological features. (a) ROC curve of predictive model trained with CT-based radiological features and clinical features. (b) ROC curve of predictive model trained with only CT-based radiological features. CT: computed tomography; ROC: receiver operating characteristic; SVM: support vector machine; LDA: linear discriminant analysis; GB: gradient boosting; RF: random forest.


Figure 4. Comparison of two machine learning classifiers: gradient boosting versus linear discriminant analysis. (a) ROC curve of gradient boosting and linear discriminant analysis. (b) Box-and-whisker graph for AUC of gradient boosting and linear discriminant analysis. The AUC of the gradient boosting model was significantly higher than that of the other classifiers (p < 0.0001). ROC: receiver operating characteristic; AUC: area under the ROC curve; GB: gradient boosting; LDA: linear discriminant analysis.
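The kind of comparison summarized in Figure 4 (two sets of AUC values compared with the Mann–Whitney test) can be sketched as follows. This is a minimal illustration, assuming a two-sided test with the normal approximation and no tie correction; the AUC lists are made-up toy values rather than the study's 100 bootstrap results, and `mann_whitney_u` is a hypothetical helper name.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test using the normal approximation (no tie correction)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # U counts the pairs in which a value from sample_a exceeds one from sample_b;
    # tied pairs contribute one half.
    u_a = sum((a > b) + 0.5 * (a == b) for a in sample_a for b in sample_b)
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    # Two-sided p value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, z, p_value


# Toy AUC values for illustration only (not taken from the study).
auc_combined = [0.80, 0.85, 0.78, 0.90, 0.82, 0.76, 0.88, 0.84]
auc_ct_only = [0.62, 0.60, 0.70, 0.55, 0.65, 0.58, 0.66, 0.61]

u_stat, z_score, p_value = mann_whitney_u(auc_combined, auc_ct_only)
```

For small samples or many ties, an exact or tie-corrected test (for example, the one provided by SciPy) would be the safer choice.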

GB is an ensemble method that combines multiple learners to obtain better predictive performance. It generates a predictive model by training decision trees sequentially, with each new tree correcting the prediction errors of the previous trees to improve predictive accuracy. In this study, through comparison analysis, we demonstrated that GB was the best classifier for prediction of pathologic femoral fracture among five machine learning methods (AdaBoost, SVM, GB, LDA, and random forest). Meanwhile, recent predictive models for pathologic fractures due to metastatic bone tumors were based on heterogeneous data from various types and locations of cancers.10–13 To reduce the bias caused by the heterogeneity of the study group, we studied only a homogeneous group limited to patients with lung cancer. Furthermore, the location of metastasis was restricted to the proximal femur, which is the site most vulnerable to pathologic fractures. We believed that reducing heterogeneity was important to improve the accuracy of the outcome.
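The sequential error-correction idea behind gradient boosting, described at the start of the paragraph above, can be illustrated with a deliberately simplified sketch. It assumes a single feature, one-split trees (stumps), and a squared-error loss, which differs from the classification loss used by scikit-learn's gradient boosting classifier; the data are toy values and the function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Fit stumps one after another, each to the residual errors of the current model."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        # Move each prediction a small step toward the stump's correction.
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, predictions


def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Toy data for illustration only: one feature and a binary outcome.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

base, stumps, predictions = boost(xs, ys)
baseline_mse = mean_squared_error(ys, [base] * len(ys))
boosted_mse = mean_squared_error(ys, predictions)
```

The learning rate plays the same role as the `learning_rate` hyperparameter listed in Table 2: smaller steps need more trees but make each correction more cautious.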

Conclusion We found that the predictive model trained with the combination of CT-based radiological and clinical features had superior predictive power for pathologic femoral fractures in patients with lung cancer. We believe that machine learning algorithms may be useful in the prediction of pathologic femoral fracture, which is a multifactorial problem.

Declaration of conflicting interests The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding The author(s) received no financial support for the research, authorship, and/or publication of this article.

References
1. Arvinius C, Parra JL, Mateo LS, et al. Benefits of early intramedullary nailing in femoral metastases. Int Orthop 2014; 38(1): 129–132.
2. Saad F, Lipton A, Cook R, et al. Pathologic fractures correlate with reduced survival in patients with malignant bone disease. Cancer 2007; 110(8): 1860–1867.
3. Coleman RE. Metastatic bone disease: clinical features, pathophysiology and treatment strategies. Cancer Treat Rev 2001; 27(3): 165–176.
4. Wedin R. Surgical treatment for pathologic fracture. Acta Orthop Scand Suppl 2001; 72(302): 2p., 1–29.
5. Mirels H. Metastatic disease in long bones: a proposed scoring system for diagnosing impending pathologic fractures. Clin Orthop Relat Res 1989; 249(12): 256–264.
6. Jawad MU and Scully SP. In brief: classifications in brief: Mirels' classification: metastatic disease in long bones and impending pathologic fracture. Clin Orthop Relat Res 2010; 468(10): 2825–2827.
7. Snyder BD, Hauser-Kara DA, Hipp JA, et al. Predicting fracture through benign skeletal lesions with quantitative computed tomography. J Bone Joint Surg Am 2006; 88(1): 55–70.
8. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Machine Learn Res 2011; 12(10): 2825–2830.
9. DeLong ER, DeLong DM and Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988; 44(3): 837–845.
10. Hipp JA, Springfield DS and Hayes WC. Predicting pathologic fracture risk in the management of metastatic bone defects. Clin Orthop Relat Res 1995; 312(3): 120–135.
11. Dijkstra PD, Oudkerk M and Wiggers T. Prediction of pathological subtrochanteric fractures due to metastatic lesions. Arch Orthop Trauma Surg 1997; 116(4): 221–224.
12. Bunting R, Lamont-Havers W, Schweon D, et al. Pathologic fracture risk in rehabilitation of patients with bony metastases. Clin Orthop Relat Res 1985; 192(1): 222–227.
13. Damron TA and Ward WG. Risk of pathologic fracture: assessment. Clin Orthop Relat Res 2003; 415(suppl): S208–S211.