SUPPLEMENTARY INFORMATION

Discovery and Validation of Novel Expression Signature for Postcystectomy Recurrence in High-Risk Bladder Cancer

Anirban P. Mitra, Lucia L. Lam, Mercedeh Ghadessi, Nicholas Erho, Ismael A. Vergara, Mohammed Alshalalfa, Christine Buerki, Zaid Haddad, Thomas Sierocinski, Timothy J. Triche, Eila C. Skinner, Elai Davicioni, Siamak Daneshmand, Peter C. Black

Journal of the National Cancer Institute

DOI: 10.1093/jnci/dju290


CONTENTS

Supplementary Methods
    Specimen processing and initial microarray analysis
    Development and validation of prognostic classifiers
    Sample size assessment
    Statistical analyses
    Analysis of biological interactions between prognostic markers
    Comparison with prior signatures
    Independent validation of genomic markers in external datasets
Supplementary Results
    Overlap of genomic markers with prior signatures and their relevance to cancer
Supplementary References
Supplementary Figure 1
Supplementary Figure 2
Supplementary Figure 3
Supplementary Figure 4
Supplementary Figure 5
Supplementary Figure 6
Supplementary Figure 7
Supplementary Figure 8
Supplementary Figure 9
Supplementary Figure 10
Supplementary Table 1
Supplementary Table 2
Supplementary Table 3
Supplementary Table 4

Page 2 of 27 │ Supplementary Information │ JNCI

Mitra AP et al.


SUPPLEMENTARY METHODS

Specimen Processing and Initial Microarray Analysis

Archival formalin-fixed paraffin-embedded (FFPE) primary tumor specimens were obtained for 225 study patients following radical cystectomy. All tumor specimens underwent histopathological re-review to confirm the diagnosis of urothelial carcinoma, reassess tumor stage, grade and other pathological characteristics, and ensure sampling of an area that was most representative of overall tumor histology. Hematoxylin/eosin-stained sections were used to guide tumor specimen selection using a 1 mm diameter core punch. Total RNA was extracted and purified using the RNeasy FFPE kit (Qiagen, Valencia, CA). RNA was amplified and labeled using the Ovation WTA FFPE system (NuGen, San Carlos, CA) and hybridized to GeneChip Human Exon 1.0 ST oligonucleotide microarrays (Affymetrix, Santa Clara, CA) according to the manufacturer’s recommendations. GeneChip Human Exon arrays use 5,362,207 probes to interrogate over one million exon clusters (collections of overlapping exons) with over 1.4 million probe selection regions (PSRs), referred to as features in this study.

Microarray data quality control was assessed by Affymetrix Power Tools packages and internally developed metrics (1). A total of 26 samples failed quality control. The median (interquartile range [IQR]) ages of the FFPE tissues for the resulting discovery (n = 133) and validation (n = 66) sets were comparable at 12.5 (9.9–15.9) and 13.1 (10.7–17.1) years, respectively (P = .21). Array files for these cases are available from the National Center for Biotechnology Information’s Gene Expression Omnibus (NCBI–GEO) database (http://www.ncbi.nlm.nih.gov/geo/) under GEO accession code GSE57933. Feature summarization and normalization were performed by frozen robust multi-array analysis (fRMA), which is available through Bioconductor (2). A custom set of frozen vectors was generated by randomly selecting 10 arrays from each of the eight batches across the study. Features interrogated with fewer than four probes, any cross-hybridizing probes as defined by Affymetrix, or deemed unreliable due to non-unique mapping to the genome (defined by probe sequence realignment to hg19) were removed.

Development and Validation of Prognostic Classifiers

With bioinformaticians who generated the prognostic classifiers remaining blinded to clinical data, two-thirds of the cohort was assigned to a discovery set and one-third to a validation set, keeping clinicopathologic characteristics balanced between both sets by testing their distribution using Fisher’s exact test, chi-square test, ANOVA or logistic regression. Features identifying patients who did and did not recur were visualized using normal quantile-quantile plots of t-test statistics using the qqnorm function in R v2.15.2. To identify features most clinically relevant to recurrence-free survival (RFS), area under the receiver-operating characteristic (ROC) curve, t-test statistics and median fold difference (MFD) were calculated in the discovery set. Based on these metrics, features were filtered by area under ROC curve (AUC) > 0.6, unadjusted t-test P < .050 and MFD > 1.5. Selected features were combined as a signature to produce a genomic classifier (GC) score by a random forest algorithm using the randomForest R package with 50,000 trees based on the discovery set (3). The mtry and nodesize parameters of the model were first assessed using the tune function in the e1071 R package (4). Briefly, a 20 (nodesize: 25 to 100 in increments of 5) by 5 (mtry: 1, 3, 5, 10, 15) grid was searched using 10-fold cross-validation with 1,000 trees built per try. The top 10 performing parameter pairs (nodesize and mtry pairs based on the 20 by 5 grid search) were further fine-tuned by manual assessment with 50,000 trees in the random forest model, and selected by minimizing the out-of-bag error in the discovery set. The randomForest package was also used to assess the variable importance of each selected feature. The GC outputs a continuous score between 0 and 1, with higher scores indicating higher probability of recurrence.
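The feature filter described above (AUC > 0.6, unadjusted P < .050, MFD > 1.5) can be illustrated with a minimal Python sketch. The study itself used R, so this is not the authors' code. Several points are assumptions made only for illustration: the function names are hypothetical, MFD is taken as the ratio of group medians on a linear scale oriented to be at least 1, the AUC assumes higher expression in patients who recurred, the t-test P value is supplied as a precomputed input, and the expression values are toy numbers rather than study data.

```python
from statistics import median

def auc(pos, neg):
    """Rank-based AUC: probability that a value from `pos` exceeds a value
    from `neg`, counting ties as one half."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def median_fold_difference(pos, neg):
    """Ratio of group medians, oriented so the result is >= 1.
    This definition is an assumption; the text does not spell it out."""
    ratio = median(pos) / median(neg)
    return max(ratio, 1 / ratio)

def passes_filter(auc_value, p_value, mfd, min_auc=0.6, max_p=0.05, min_mfd=1.5):
    """Apply the three thresholds stated in the text."""
    return auc_value > min_auc and p_value < max_p and mfd > min_mfd

# Toy linear-scale expression values for one feature (not study data).
recurred = [8.0, 9.5, 12.0, 7.5, 10.0]
no_recurrence = [5.0, 6.0, 9.0, 4.5, 6.5]

feature_auc = auc(recurred, no_recurrence)
feature_mfd = median_fold_difference(recurred, no_recurrence)
# The P value is a hypothetical placeholder for a separately computed t-test.
keep = passes_filter(feature_auc, p_value=0.01, mfd=feature_mfd)
```

All three thresholds must hold at once, so a feature with a large fold difference but poor discrimination is still dropped.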

A “clinical-only” classifier (CC) was also developed on the discovery set. This incorporated age, gender, pathological stage, and lymphovascular invasion status modeled using logistic regression. Interactions between different clinicopathologic factors were explored in the discovery set and applied to the model when significant. A complete model with all clinicopathologic variables and significant interaction terms was built. Several nested models were subsequently developed that excluded less significant variables. Analysis of variance and chi-squared tests were used to select between these nested models. If the difference between two nested models was not significant, then the model with fewer variables was used.

Risk scores for postcystectomy recurrence based on clinicopathologic variables were also calculated for each patient based on the clinical nomogram from the International Bladder Cancer Nomogram Consortium (IBCNC) (5). To evaluate the joint prognostic value of genomic information and clinicopathologic variables, GC was combined with IBCNC and CC by logistic regression in the discovery set into integrated G-IBCNC and G-CC, respectively. Analogous to GC, patients were scored using the above classifiers between 0 and 1. All locked genomic, clinical and genomic-clinicopathologic classifiers were applied to patients in the validation set in a blinded fashion.
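As a rough illustration of how a logistic combination of a genomic and a clinical score yields an integrated score between 0 and 1, consider the Python sketch below. The coefficients are hypothetical placeholders, since the fitted values are not given in this section, and the function name is an assumption.

```python
import math

def integrated_score(gc, clinical, intercept, beta_gc, beta_clinical):
    """Logistic combination of a genomic and a clinical score.
    The logistic link keeps the result strictly between 0 and 1."""
    linear = intercept + beta_gc * gc + beta_clinical * clinical
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients for illustration only (not the fitted model).
COEFS = {"intercept": -3.0, "beta_gc": 4.0, "beta_clinical": 2.0}

low = integrated_score(gc=0.2, clinical=0.3, **COEFS)
high = integrated_score(gc=0.8, clinical=0.7, **COEFS)
```

Once the coefficients are fixed on the discovery set, the same function can be applied unchanged to validation patients, which is what locking a classifier amounts to in practice.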




Sample Size Assessment

Sizes of the discovery and validation sets following microarray data quality control were assessed for adequacy for developing classifiers using high dimensional data (6). A discovery set of 133 patients (with 68 cases that recurred) was expected to produce a classifier with accuracy within 5% of an optimal classifier if the largest standardized fold change was as low as 1.1. Additionally, a validation set of 66 patients (with 33 cases that recurred) would have 97% power to achieve a minimum AUC of 0.75 (based on Diagnostic Test module of PASS v11.0.8).

Statistical Analyses

Univariable prognostic abilities of classifiers were compared using discrimination boxplots, Wilcoxon rank-sum test and logistic regression. Boxes in the boxplots represent score quartiles, and notches were calculated using the boxplot.stats function in R based on the formula by McGill et al. (7,8). This method estimates the 95% confidence interval (CI) for the median and is insensitive to underlying distributions of the sample, thereby allowing comparison of two medians. Cumulative incidence curves for RFS were constructed using Fine-Gray competing risks analysis (9). AUCs were used for classifier performance assessment with respect to binary outcomes using the pROC R library (10). Extension of AUC for censored data was also used to compare performance of genomic classifiers, external signatures and clinical nomograms with respect to RFS (survival-ROC analysis using survival R library) (11). Time-dependent survival ROCs were evaluated for prediction of recurrence within four years postcystectomy (12). For survival ROC, the nearest-neighbor estimator with λ = 0.002 was used to approximate survival function density. The 95% CIs for survival-ROC AUCs were approximated through bootstrapping. Decision curve analyses were used to assess the net clinical benefit of genomic versus clinical models across different threshold probabilities (13). Reassignment of patients to risk strata based on addition of genomic information was assessed using reclassification plots (14).
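The notch calculation mentioned above can be sketched in Python as median ± 1.58 × IQR / √n, the form of the McGill et al. formula used by R's boxplot routine. This is an illustrative sketch rather than the study's code. The quartiles are computed here with Python's "inclusive" method, which is an assumption and can differ slightly from the hinges R uses, and the function names are hypothetical.

```python
import math
from statistics import median, quantiles

def median_notch(scores):
    """Return (lower, median, upper), where the notch half-width is
    1.58 * IQR / sqrt(n), an approximate 95% interval for the median."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(len(scores))
    centre = median(scores)
    return centre - half_width, centre, centre + half_width

def notches_overlap(a, b):
    """True when the two notch intervals share at least one point."""
    lo_a, _, hi_a = median_notch(a)
    lo_b, _, hi_b = median_notch(b)
    return lo_a <= hi_b and lo_b <= hi_a

# Toy values (not study data).
group_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
group_b = [11, 12, 13, 14, 15, 16, 17, 18, 19]
lower, centre, upper = median_notch(group_a)
separated = not notches_overlap(group_a, group_b)
```

Non-overlapping notches are commonly read as evidence that two medians differ, which is how the discrimination plots are interpreted.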

Importance of genomic-based classifiers relative to individual and combined clinical information, and their independent prognostic abilities were evaluated by multivariable Cox regression models. Proportional hazards assumptions of the Cox model were confirmed by evaluating the scaled Schoenfeld residuals (15). For univariable and multivariable analyses, classifier scores were assessed as continuous variables (step size = 0.1) unless dichotomization was required. For analyses where classifier score was categorized, majority rule criterion with classifier scores of ≥ 0.5 and < 0.5 grouped as high-risk and low-risk, respectively, was used.
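One reading of the "step size = 0.1" convention is that hazard ratios are reported per 0.1-unit increase in classifier score. That reading is an assumption. The Python sketch below shows the rescaling of a Cox coefficient under it, together with the majority-rule cutoff at 0.5. The function names and the example coefficient are hypothetical.

```python
import math

def hazard_ratio_per_step(beta_per_unit, step=0.1):
    """Hazard ratio for a `step` increase in score, given the Cox
    coefficient expressed per one full unit of score."""
    return math.exp(beta_per_unit * step)

def risk_group(score, cutoff=0.5):
    """Majority-rule dichotomization: scores >= cutoff are high-risk."""
    return "high-risk" if score >= cutoff else "low-risk"

# Hypothetical coefficient: a hazard ratio of 32 per full unit of score.
hr_step = hazard_ratio_per_step(math.log(32.0))
groups = [risk_group(s) for s in (0.49, 0.50, 0.72)]
```

Reporting per 0.1 unit keeps hazard ratios on an interpretable scale for a score that spans only 0 to 1.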

Estimates of censoring distribution were used to calculate follow-up duration (16). Analyses were performed using R v2.15.2. All tests were two-sided with type I error probability of 5%.

Analysis of Biological Interactions between Prognostic Markers

To understand the biological interactions of constituent features within GC, first-degree partners of their respective genes were extracted using the Human Signaling Network v5, a curated database with nearly 63,000 directed and undirected interactions between approximately 6,300 proteins (17–20). Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 was used to assess biological processes, molecular functions and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the gene network (21–23). Enriched processes, molecular functions or KEGG pathways with Benjamini-Hochberg-corrected P < .050 were selected (24). Enrichment Map, a network-based gene set enrichment visualization method implemented as a Cytoscape plugin, was used to visualize the highly enriched biological concepts and group similar terms together (25). Nodes in the Enrichment Map represent individual Gene Ontology (GO) terms or KEGG pathways and lines represent the amount of overlap between the terms.
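The Benjamini-Hochberg correction used to select enriched terms can be sketched in a few lines of Python. This is a generic implementation of the step-up procedure, not DAVID's own code. The term names and raw P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values are
    # monotone: each is the minimum of p * m / rank over equal or higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw enrichment P values for four terms (not study data).
raw = {"term_a": 0.008, "term_b": 0.6, "term_c": 0.02, "term_d": 0.001}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
selected = [term for term, p in adj.items() if p < 0.05]
```

The adjustment controls the expected share of false discoveries among selected terms rather than the chance of any single false positive.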

Comparison with Prior Signatures

The performance of GC was compared to that of previously described prognostic signatures for muscle-invasive urothelial carcinoma of the bladder (UCB). We identified five studies that defined seven prognostic genomic signatures developed exclusively for muscle-invasive or node-positive UCB (Supplementary Table 3) (26–30). As these signatures were developed using early-generation microarrays with relatively limited genomic coverage, it was feasible to reconstruct the gene lists on our study cohort as GeneChip Human Exon 1.0 ST microarrays offer transcriptome-wide coverage with the ability to interrogate entire lengths of protein-coding genes and noncoding RNAs. Genomic markers and genes that comprised the individual signatures were mapped back to the associated GeneChip Human Exon array core transcript clusters where possible, or extended or full transcript clusters when not aligned with the core transcript cluster. When an appropriate transcript-level feature was not identified, the closest one to the probe on the original array was used. While some prior signatures were modeled using other machine learning algorithms, their code was not publicly available. Moreover, other prior studies only presented a list of differentially expressed genes. Therefore, summarized expression values for transcripts in each signature were combined using random forest as the standard machine learning algorithm for modeling and to provide a comparison against GC. Nodesize and mtry parameters were optimized for each model to minimize the out-of-bag error rate, thereby maximizing their prognostic performance in the discovery set. As with GC, the optimized models of the prior signatures were then applied in a blinded manner on the validation set.

Independent Validation of Genomic Markers in External Datasets

The combined prognostic ability of genes represented by the GC features was independently validated on four external UCB datasets using the SurvExpress biomarker validation tool (31). The first dataset comprised patients with chemotherapy-naïve, muscle-invasive, high-grade tumors (T2-T4aNxMx) accessed through the Cancer Genome Atlas (TCGA) database for bladder urothelial carcinoma (n = 54) (32). Three datasets were accessed through the NCBI–GEO database. Dataset accession GSE13507 (n = 164) included UCB patients across all tumor stages and grades (26). Dataset accession GSE5287 (n = 30) included patients with locally advanced or metastatic disease (33). Dataset accession GSE31684 (n = 93) included patients with nonmuscle-invasive and muscle-invasive disease with no evidence of metastatic disease at radical cystectomy (27). UCB patients from each dataset with publicly available gene expression and clinical information were included. While the external datasets included high-risk bladder cancer patients, their composition did not optimally resemble the discovery and validation cohorts in this study. Nevertheless, an attempt was made to validate GC in these datasets based on the hypothesis that the constituent markers were generally prognostic for high-risk bladder cancer. Official symbols of genes associated with each of the constituent features within GC were provided as input. Briefly, for each dataset, the probes associated with each gene were identified where available, and the prognostic index of each patient was estimated using a Cox model. Each cohort was then divided into low-risk and high-risk groups defined by a median split. Discrimination ability of the GC was then assessed by hazard-ratio estimate, concordance index, and log-rank test of differences between risk groups with associated Kaplan-Meier curves using default settings. Overall survival was used as the common denominator for comparison as it was the only clinical endpoint available for all four datasets.
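The median split and the concordance index referred to above can be illustrated with the Python sketch below. It implements Harrell's C for right-censored data in its simplest form and is not SurvExpress code. The function names are hypothetical and the four-patient cohort is toy data.

```python
from statistics import median

def median_split(prognostic_index):
    """Label each patient high-risk if the index exceeds the cohort median."""
    cut = median(prognostic_index)
    return ["high" if x > cut else "low" for x in prognostic_index]

def concordance_index(times, events, risk):
    """Harrell's C: share of comparable pairs in which the patient with the
    shorter observed survival has the higher risk (risk ties count 0.5)."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable only if patient i had an observed event before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy cohort (not study data): follow-up in months, 1 = death observed.
times = [5, 3, 9, 7]
events = [1, 1, 0, 1]
risk = [0.6, 0.2, 0.1, 0.4]

groups = median_split(risk)
c_index = concordance_index(times, events, risk)
```

A concordance index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.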




SUPPLEMENTARY RESULTS

Overlap of Genomic Markers with Prior Signatures and Their Relevance to Cancer

Previously identified genes associated with UCB were cataloged by a thorough PubMed search for all publications in English listed through January 2014. This resulted in the identification of 2,059 unique genes. A comparison with the markers represented within GC revealed that four genes had been previously documented as being associated with UCB:

MECOM: Associated with tumor grade (34); progression in non-muscle-invasive bladder cancer (35).

PPP1R12A: Prognostic for overall survival in muscle-invasive bladder cancer (27); associated with outcome in patients with muscle-invasive and node-positive disease (28).

SYPL1: Associated with tumor stage (36).

ARFGEF1: Associated with tumor stage (34).

MECOM corresponds to the most important feature within GC (Supplementary Figure 3). Human Signaling Network analysis showed that the protein encoded by MECOM can interact with CTBP1, SMAD3, CREBBP, MAPK8 and MAPK9, and has been associated with clinical outcomes in leukemia and ovarian cancer (37,38).




SUPPLEMENTARY REFERENCES

1. Lockstone HE. Exon array data analysis using Affymetrix power tools and R statistical software. Brief Bioinform. 2011;12(6):634–644.
2. McCall MN, Bolstad BM, Irizarry RA. Frozen robust multiarray analysis (fRMA). Biostatistics. 2010;11(2):242–253.
3. Breiman L. Random forests. Machine Learning. 2001;45(1):5–32.
4. Meyer D, Dimitriadou E, Hornik K, Weingessel A, Leisch F. Misc Functions of the Department of Statistics (e1071), TU Wien. v1.6-1. The Comprehensive R Archive Network; 2012. http://cran.r-project.org/web/packages/e1071/index.html. Published September 12, 2012.
5. Bochner BH, Kattan MW, Vora KC. Postoperative nomogram predicting risk of recurrence after radical cystectomy for bladder cancer. J Clin Oncol. 2006;24(24):3967–3972.
6. Dobbin KK, Simon RM. Sample size planning for developing classifiers using high-dimensional DNA microarray data. Biostatistics. 2007;8(1):101–117.
7. Chambers JM, Cleveland WS, Kleiner B, Tukey PA. Graphical Methods for Data Analysis. New Jersey, NJ: Wadsworth & Brooks/Cole Publishing Company; 1983.
8. McGill R, Tukey JW, Larsen WA. Variations of box plots. Am Stat. 1978;32(1):12–16.
9. Fine JP, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc. 1999;94(446):496–509.
10. Robin X, Turck N, Hainard A, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77.
11. Therneau T. A Package for Survival Analysis in S. R package v2.36-14. 2012.
12. Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000;56(2):337–344.
13. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26(6):565–574.
14. Pepe MS. Problems with risk reclassification methods for evaluating prediction models. Am J Epidemiol. 2011;173(11):1327–1335.
15. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika. 1994;81(3):515–526.
16. Korn EL. Censoring distributions as a measure of follow-up in survival analysis. Stat Med. 1986;5(3):255–260.
17. Cui Q, Ma Y, Jaramillo M, et al. A map of human cancer signaling. Mol Syst Biol. 2007;3:152.
18. Awan A, Bari H, Yan F, et al. Regulatory network motifs and hotspots of cancer genes in a mammalian cellular signalling network. IET Syst Biol. 2007;1(5):292–297.
19. Li L, Tibiche C, Fu C, et al. The human phosphotyrosine signaling network: evolution and hotspots of hijacking in cancer. Genome Res. 2012;22(7):1222–1230.
20. Newman RH, Hu J, Rho HS, et al. Construction of human activity-based phosphorylation networks. Mol Syst Biol. 2013;9:655.
21. Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57.
22. Huang DW, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009;37(1):1–13.
23. Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M. The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004;32(Database issue):D277–D280.
24. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 1995;57(1):289–300.
25. Merico D, Isserlin R, Stueker O, Emili A, Bader GD. Enrichment map: a network-based method for gene-set enrichment visualization and interpretation. PLoS One. 2010;5(11):e13984.
26. Kim WJ, Kim EJ, Kim SK, et al. Predictive value of progression-related gene classifier in primary non-muscle invasive bladder cancer. Mol Cancer. 2010;9:3.
27. Riester M, Taylor JM, Feifer A, et al. Combination of a novel gene expression signature with a clinical nomogram improves the prediction of survival in high-risk bladder cancer. Clin Cancer Res. 2012;18(5):1323–1333.
28. Sanchez-Carbayo M, Socci ND, Lozano J, Saint F, Cordon-Cardo C. Defining molecular profiles of poor outcome in patients with invasive bladder cancer using oligonucleotide microarrays. J Clin Oncol. 2006;24(5):778–789.
29. Blaveri E, Simko JP, Korkola JE, et al. Bladder cancer outcome and subtype classification by gene expression. Clin Cancer Res. 2005;11(11):4044–4055.
30. Kim WJ, Kim SK, Jeong P, et al. A four-gene signature predicts disease progression in muscle invasive bladder cancer. Mol Med. 2011;17(5-6):478–485.
31. Aguirre-Gamboa R, Gomez-Rueda H, Martinez-Ledesma E, et al. SurvExpress: an online biomarker validation tool and database for cancer gene expression data using survival analysis. PLoS One. 2013;8(9):e74250.
32. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of urothelial bladder carcinoma. Nature. 2014;507(7492):315–322.
33. Als AB, Dyrskjøt L, von der Maase H, et al. Emmprin and survivin predict response and survival following cisplatin-containing chemotherapy in patients with advanced bladder cancer. Clin Cancer Res. 2007;13(15):4407–4414.
34. Lindgren D, Frigyesi A, Gudjonsson S, et al. Combined gene expression and genomic profiling define two intrinsic molecular subtypes of urothelial carcinoma and gene signatures for molecular grading and outcome. Cancer Res. 2010;70(9):3463–3472.
35. Wang R, Morris DS, Tomlins SA, et al. Development of a multiplex quantitative PCR signature to predict progression in non-muscle-invasive bladder cancer. Cancer Res. 2009;69(9):3810–3818.
36. Dyrskjøt L, Thykjaer T, Kruhoffer M, et al. Identifying distinct classes of bladder carcinoma using microarrays. Nat Genet. 2003;33(1):90–96.
37. Ho PA, Alonzo TA, Gerbing RB, et al. High EVI1 expression is associated with MLL rearrangements and predicts decreased survival in paediatric acute myeloid leukaemia: a report from the Children’s Oncology Group. Br J Haematol. 2013;162(5):670–677.
38. Nanjundan M, Nakayama Y, Cheng KW, et al. Amplification of MDS1/EVI1 and EVI1, located in the 3q26.2 amplicon, is associated with favorable patient prognosis in ovarian cancer. Cancer Res. 2007;67(7):3074–3084.




Supplementary Figure 1. Study design schema for classifier discovery and initial validation. A “clinical‐only” classifier (CC) based on clinicopathologic variables and a genomic classifier (GC) based on 15 RNA features predictive for cancer recurrence were developed on patients in the discovery set (n = 133). Postcystectomy recurrence probabilities were also calculated based on the International Bladder Cancer Nomogram Consortium (IBCNC) nomogram. GC and IBCNC were integrated into G‐IBCNC, and GC and CC were combined as G‐CC. All classifier models were then locked and applied in a blinded manner on patients in the validation set (n = 66).




Supplementary Figure 2. Normal quantile‐quantile plot of t‐test statistics for features separating patients who did and did not recur in the A) discovery set and B) validation set. Lines fitted through the first and third quartiles are shown in black.




Supplementary Figure 3. Variable importance of the constituent features within GC. Importance for the 15 GC features was measured by their accuracy (left) and “Gini importance” (right) metrics using the random forest algorithm. The plots indicate the degree by which an importance measure decreased when each feature was removed individually. The higher the mean decrease, the more important a feature was for the model. GC = genomic classifier.




Supplementary Figure 4. Decision curve analyses comparing the net benefit of clinical‐only models versus genomic‐clinicopathologic classifiers. Model performance is compared to extremes of classifying all patients as being at risk for recurrence (thus warranting treatment of all patients; gray curve) versus classifying no patients at risk (thus treating none; horizontal black line). The “decision‐to‐treat” threshold, which represents the probability of recurrence used to trigger a decision to treat, ranges from 0 to 100 with sensitivity and specificity of each prediction model calculated at each threshold to determine net benefit. An optimal classifier has high net benefit above the gray “treat all” curve. When comparing IBCNC versus G‐IBCNC (left panel), and CC versus G‐CC (right panel), net benefits of the respective genomic‐clinicopathologic classifiers were superior to clinical‐only models over a wide range of decision‐to‐treat thresholds. As an example, if a decision to administer adjuvant therapy was triggered by a 50% threshold probability for post‐cystectomy recurrence, then in comparison to “treat‐all” or “treat‐none” scenarios (where no prediction models would be used), employing genomic‐clinicopathologic classifiers would reduce unnecessary treatment (i.e., for low‐risk patients) by 28% with G‐CC compared to 17% with IBCNC. IBCNC = postcystectomy recurrence nomogram from the International Bladder Cancer Nomogram Consortium; G‐IBCNC = integrated genomic‐IBCNC classifier; CC = “clinical‐only” classifier; G‐CC = integrated genomic‐CC classifier.
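The net benefit plotted in a decision curve follows the standard Vickers and Elkin definition: the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. A minimal Python sketch is given below. Thresholds are expressed as probabilities between 0 and 1 rather than percentages, the function names are hypothetical, and the outcomes and predicted risks are toy values, not study data.

```python
def net_benefit(outcomes, probabilities, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold:
    TP/n - (FP/n) * threshold / (1 - threshold). Requires threshold < 1."""
    n = len(outcomes)
    treated = [y for y, p in zip(outcomes, probabilities) if p >= threshold]
    true_pos = sum(treated)
    false_pos = len(treated) - true_pos
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)

def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: every patient is treated regardless of risk."""
    return net_benefit(outcomes, [1.0] * len(outcomes), threshold)

# Toy data (not study data): 1 = recurrence, with model-predicted risks.
outcomes = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
probabilities = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4, 0.4, 0.55]

model_nb = net_benefit(outcomes, probabilities, threshold=0.5)
treat_all_nb = net_benefit_treat_all(outcomes, threshold=0.5)
treat_none_nb = 0.0  # treating nobody yields neither benefit nor harm
```

Evaluating these functions over a range of thresholds and plotting the results gives the model, "treat all" and "treat none" curves of a decision curve analysis.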





Supplementary Figure 5. Discrimination plots for patients in the validation set. Notched box plots indicate distributions of A) IBCNC, B) CC, C) GC, D) G‐IBCNC, and E) G‐CC scores of patients based on their recurrence status. Box represents median score and its 25th and 75th percentiles. Notches represent 95% confidence interval of the median; whiskers extend to 1.5 times the interquartile range from the median. Blue and red dots represent patients who did not recur and recurred, respectively. P values were determined by Wilcoxon rank‐sum test and are two‐sided. IBCNC = postcystectomy recurrence nomogram from the International Bladder Cancer Nomogram Consortium; CC = “clinical‐only” classifier; GC = genomic classifier; G‐IBCNC = integrated genomic‐IBCNC classifier; G‐CC = integrated genomic‐CC classifier.




Supplementary Figure 6. Survival AUCs plotted over time for patients in the validation set. Survival AUCs determined over a range of timepoints following cystectomy indicate that A) G‐CC has the best performance among all patients, and B) G‐CC and GC have comparable performance that is superior to CC among the subset of node‐negative patients for predicting recurrence. AUC = area under receiver‐operating characteristic curve; CC = “clinical‐only” classifier; GC = genomic classifier; G‐CC = integrated genomic‐CC classifier.




Supplementary Figure 7. Distribution of GC scores across pathological stages among patients in the validation set. Dots and lines represent individual patient and median GC scores, respectively. Blue and red dots represent patients who did not recur and recurred, respectively. Median GC scores were higher in patients who recurred than those who did not experience recurrence at last follow‐up for patients with pT2N0M0 (0.64 versus 0.38), pT3‐4aN0M0 (0.59 versus 0.28) and pTanyN1‐3M0 (0.57 versus 0.34) disease. GC = genomic classifier.




Supplementary Figure 8. Distribution of classifier scores among patient subsets in the validation set based on nodal status. Among node‐negative patients (left panel), GC scores provided better discrimination based on recurrence status compared to CC scores (P = .008 versus P = .069, respectively). Among node‐positive patients (right panel), while comparative significance was not achieved for GC and CC scores (P = .16 versus P = .41, respectively) due to the limited number of node‐positive patients who did not recur, GC scores still provided relatively better discrimination as evidenced by nonoverlapping 95% confidence intervals of its median values. Black line and gray box represent median score and its associated 95% confidence interval, respectively. Blue and red dots represent patients who did not recur and recurred, respectively. P values were determined by Wilcoxon rank‐sum test. CC = “clinical‐only” classifier; GC = genomic classifier.




Supplementary Figure 9. Major GO terms associated with the 15 GC markers. First‐degree partners of the constituent features within GC were identified and a GO term enrichment of the associated genes was assessed. Each red node represents a significantly enriched pathway associated with a GO term, and green links represent overlapping genes between pathways. Thickness of green links represents the frequency of overlapping genes between the GO terms. GO terms representing similar functions are grouped within dotted blue ellipses, and super‐grouped within dotted red ellipses. GC = genomic classifier; GO = Gene Ontology; MAPK = mitogen‐activated protein kinase.




Supplementary Figure 10. Performance of GC and previously reported prognostic genomic signatures for muscle‐invasive bladder cancer as assessed by survival‐ROC analysis in the discovery and validation sets for predicting postcystectomy recurrence. As all predictors including GC were optimized on the discovery set, their AUCs within this subgroup of patients were comparable. However, upon blinded assessment in the validation set, GC had the highest AUC of all genomic predictors. Circles and whiskers represent AUC and associated 95% confidence intervals, respectively. AUCs for predictors are also listed under the respective patient sets. Names of individual predictors are as referenced in Supplementary Table 3. Dotted orange line and shaded region highlight median AUC of GC and its 95% confidence intervals, respectively, in the validation set. ROC = receiver‐operating characteristic; AUC = area under ROC curve; GC = genomic classifier.



Supplementary Table 1. Summary description of features comprising the genomic classifier for predicting postcystectomy bladder cancer recurrence*

Recurrence‐specific feature metrics (AUC, MFD, t‐test P) are shown for the discovery set and the validation set.

| Feature and type | Associated gene | Chromosomal location | Discovery AUC | Discovery MFD | Discovery t‐test P | Validation AUC | Validation MFD | Validation t‐test P |
|---|---|---|---|---|---|---|---|---|
| 1† | FOXO6 | 1p34.2 | 0.6251 | 1.6514 | .0065 | 0.7704 | 1.9620 | .0005 |
| 2† | HSD17B7 | 1q23 | 0.6234 | 1.5401 | .0138 | 0.7282 | 1.3508 | .0016 |
| 3† | ARID4B | 1q42.1–q43 | 0.6223 | 1.5427 | .0310 | 0.6593 | 1.5130 | .0142 |
| 4† | ENAH | 1q42.12 | 0.6617 | 1.6644 | .0028 | 0.7107 | 1.5311 | .0012 |
| 5‡ | MAP4K3 | 2p22.1 | 0.6067 | 1.5895 | .0327 | 0.7521 | 1.5929 | .0003 |
| 6† | MARCH7 | 2q24.2 | 0.6020 | 1.5425 | .0275 | 0.7420 | 1.6120 | .0006 |
| 7§ | MECOM | 3q26.2 | 0.6434 | 1.5113 | .0045 | 0.6345 | 1.1318 | .0275 |
| 8† | LRBA | 4q31.3 | 0.6261 | 1.5432 | .0081 | 0.6373 | 1.4009 | .0286 |
| 9† | MUT | 6p12.3 | 0.6206 | 1.5150 | .0232 | 0.6107 | 1.2203 | .0940 |
| 10† | CRCP | 7q11.21 | 0.6135 | 1.5144 | .0259 | 0.7319 | 1.9790 | .0013 |
| 11† | SYPL1 | 7q22.3 | 0.6064 | 1.5328 | .0485 | 0.6777 | 1.5042 | .0064 |
| 12† | ARFGEF1 | 8q13 | 0.6322 | 1.5580 | .0354 | 0.6777 | 1.4951 | .0075 |
| 13† | EHF | 11p12 | 0.6437 | 1.7470 | .0022 | 0.6437 | 1.1631 | .0545 |
| 14† | METTL7A | 12q13.12 | 0.6128 | 1.6586 | .0201 | 0.7337 | 1.8840 | .0012 |
| 15† | PPP1R12A | 12q15–q21 | 0.6104 | 1.5429 | .0406 | 0.6630 | 1.3429 | .0049 |

Important biological process(es) annotated for the associated genes: regulation of transcription; cell differentiation; cell proliferation/cell‐cycle regulation/apoptosis; signaling pathways/signal transduction; other (methyltransferase activity, cell adhesion, lipid metabolism, protein ubiquitination, angiogenesis regulation, cholesterol metabolism, histone methylation).

* All statistical tests were two‐sided. † Exonic feature. ‡ Exonic antisense feature. § Intronic antisense feature.

AUC = area under receiver‐operating characteristic curve; MFD = median fold difference.

Biological processes summarized using Gene Ontology Annotation database [Dimmer EC, Huntley RP, Alam‐Faruque Y, et al. The UniProt‐GO Annotation database in 2011. Nucleic Acids Res. 2012;40(Database issue):D565–D570].
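The per‐feature metrics reported in Supplementary Table 1 can be illustrated with a minimal sketch. It assumes that AUC is the Mann‐Whitney probability that a patient who recurred has a higher expression value than a patient who did not (ties counted as one half), and that MFD is the ratio of the group medians on a linear scale. Neither definition is spelled out in the table itself, and the function names and toy values below are hypothetical.

```python
from statistics import median

def auc(pos, neg):
    """Probability that a value drawn from pos exceeds one drawn from neg.

    Ties contribute 0.5, which matches the Mann-Whitney formulation of the
    area under the ROC curve.
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def median_fold_difference(pos, neg):
    """Assumed definition: ratio of group medians on a linear scale."""
    return median(pos) / median(neg)

# Toy expression values for one feature (hypothetical, not study data).
recurred = [8.0, 6.0, 9.0, 5.0]
not_recurred = [4.0, 6.0, 3.0, 5.0]

feature_auc = auc(recurred, not_recurred)
feature_mfd = median_fold_difference(recurred, not_recurred)
print(feature_auc, round(feature_mfd, 4))
```

If the expression values were stored on a log2 scale, the fold difference would instead be obtained by exponentiating the difference of the medians; the sketch keeps the linear‐scale form for simplicity.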


Page 24 of 27 │ Supplementary Information │ JNCI Mitra AP et al.


Supplementary Table 2. Multivariable analysis comparing genomic‐clinicopathologic versus clinical‐only models in the validation set*

| Model | Predictor | Relative risk of recurrence: hazard ratio (95% CI) | P |
|---|---|---|---|
| Model 1 | G‐IBCNC | 1.18 (1.03 ‐ 1.36) | .016 |
| Model 1 | IBCNC | 1.04 (0.90 ‐ 1.20) | .62 |
| Model 2 | G‐CC | 1.18 (1.04 ‐ 1.33) | .008 |
| Model 2 | CC | 1.10 (0.92 ‐ 1.31) | .30 |

* Hazard ratio estimated by Cox proportional hazards analysis with ridge regression. All statistical tests were two‐sided.

CI = confidence interval; IBCNC = postcystectomy recurrence nomogram from the International Bladder Cancer Nomogram Consortium; G‐IBCNC = integrated genomic‐IBCNC classifier; CC = “clinical‐only” classifier; G‐CC = integrated genomic‐CC classifier.
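As a rough consistency check on Supplementary Table 2, a hazard ratio and its 95% confidence interval can be converted back into an approximate two‐sided P value. The sketch below assumes a Wald‐type interval that is symmetric on the log scale; this is an assumption, since the reported estimates come from ridge‐penalized Cox regression, and the function name is hypothetical.

```python
import math

def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P from a hazard ratio and its 95% CI.

    Assumes the interval is exp(log(hr) +/- z_crit * se), i.e. a Wald interval.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# G-IBCNC row of Model 1: hazard ratio 1.18 (1.03 - 1.36).
z, p = wald_p_from_ci(1.18, 1.03, 1.36)
print(round(z, 2), round(p, 3))
```

For the G‐IBCNC row this gives a P value of roughly .02, the same order as the reported .016. Exact agreement is not expected, because the published hazard ratio and interval are rounded to two decimals and the penalized fit need not yield a purely Wald‐type interval.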



74 ¶

40 ¶

38 ¶

Sanchez‐Carbayo 2006

Blaveri 2005

Kim 2011

5

4

24

85

20

44

49

15 61

(mappable transcripts, n)

Signature size

Progression

Overall survival

Overall survival

Overall survival

Progression

Cancer‐specific survival

Recurrence‐free survival Overall survival

Prognostic endpoint

0.82 (0.75 ‐ 0.90)

0.83 (0.76 ‐ 0.90)

0.86 (0.80 ‐ 0.93)

0.79 (0.71 ‐ 0.89)

0.82 (0.75 ‐ 0.90)

0.82 (0.75 ‐ 0.91)

0.83 (0.75 ‐ 0.91) 0.85 (0.78 ‐ 0.92)

Survival‐ROC AUC (95% CI)

1.66 (1.43 ‐ 1.92)

1.81 (1.53 ‐ 2.15)

1.98 (1.63 ‐ 2.42)

1.60 (1.36 ‐ 1.88)

1.64 (1.40 ‐ 1.93)

1.84 (1.51 ‐ 2.24)

1.36 (1.23 ‐ 1.49) 2.19 (1.75 ‐ 2.75)

Hazard ratio (95% CI)

Discovery set