IMPROVED BAYESIAN ENSEMBLE CLASSIFICATION OF HIGH-DIMENSIONAL COLON CANCER DATA

Inventors

Dr. Mohd Asrul Affendi Bin-Abdullah
Department of Mathematics and Statistics, Faculty of Science Technology and Human Development, Universiti Tun Hussein Onn Malaysia.
[email protected]

Oyebayo Ridwan Olaniran
Department of Mathematics and Statistics, Faculty of Science Technology and Human Development, Universiti Tun Hussein Onn Malaysia.
[email protected]
PRODUCT DESCRIPTION
Improved Bayesian Ensemble Classification (IBEC) is an algorithm, implemented in the R language, for non-clinical diagnosis of colon cancer from messenger ribonucleic acid (mRNA) samples. Because most cancer types, including colon cancer, occur in four stages (Cancer Research UK, 2015), IBEC mimics this staging to build a robust predictive model that can detect the presence of colon cancer at any stage. The alarming recent increase in cancer mortality has made earlier detection a major focus of research in bioinformatics and biotechnology, since cancer can be managed well if diagnosed early. Specifically, IBEC processes mRNA samples to identify the presence or absence of colon cancer and the associated gene features. The algorithm uses microarray data obtained by analyzing gene expression levels in colon cancer. To demonstrate the procedure, we applied IBEC to the colon cancer microarray experiment of Alon et al. (1999). On these data, IBEC achieved high sensitivity (the proportion of colon cancer patients correctly identified) and high accuracy (the proportion of all patients, with or without colon cancer, correctly classified).
NOVELTY
The IBEC algorithm incorporates staging in its model building, which makes it possible to detect the presence or absence of colon cancer at any stage. It also models the uncertainty in each of the models built at each stage to compute the final output and the relative importance of the associated genes.
[Figure: Colon cancer stages]
[Figure: Flow chart of IBEC. Labels: mRNA samples; Genes; Cancerous or normal cells]
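The general idea of combining several models, expressing the uncertainty of the combined prediction, and ranking genes by relative importance can be illustrated with a small sketch. This is not the IBEC algorithm, whose details are given in the cited publication. It is a generic bootstrap ensemble of nearest-centroid classifiers, written in Python for illustration rather than in the R used by the authors. The data are synthetic, and every function name and parameter (`fit_member`, `n_members`, `k`) is hypothetical.

```python
import random
from statistics import mean

def fit_member(X, y, k):
    """Fit one ensemble member: keep the k genes with the largest gap
    between class means and store both class centroids on those genes."""
    gaps = []
    for j in range(len(X[0])):
        m1 = mean(x[j] for x, lab in zip(X, y) if lab == 1)
        m0 = mean(x[j] for x, lab in zip(X, y) if lab == 0)
        gaps.append((abs(m1 - m0), j, m0, m1))
    top = sorted(gaps, reverse=True)[:k]
    return [(j, m0, m1) for _, j, m0, m1 in top]

def predict_member(member, x):
    """Assign the class whose centroid is nearer on the selected genes."""
    d0 = sum((x[j] - m0) ** 2 for j, m0, _ in member)
    d1 = sum((x[j] - m1) ** 2 for j, _, m1 in member)
    return 1 if d1 < d0 else 0

def fit_ensemble(X, y, n_members=25, k=3, seed=0):
    """Fit one member per bootstrap resample of the samples."""
    rng = random.Random(seed)
    n = len(X)
    members = []
    while len(members) < n_members:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if 0 < sum(yb) < n:  # skip resamples that miss a class
            members.append(fit_member([X[i] for i in idx], yb, k))
    return members

def predict_proba(members, x):
    """Fraction of members voting 'tumor'; values near 0.5 signal
    disagreement, i.e. an uncertain prediction."""
    return mean(predict_member(m, x) for m in members)

def gene_importance(members, n_genes):
    """Percentage of members that selected each gene."""
    counts = [0] * n_genes
    for member in members:
        for j, _, _ in member:
            counts[j] += 1
    return [100.0 * c / len(members) for c in counts]

# Synthetic data (assumption): 40 samples, 20 genes, genes 0 and 1 informative.
rng = random.Random(1)
n_genes = 20
X, y = [], []
for i in range(40):
    lab = i % 2
    row = [rng.gauss(0, 1) for _ in range(n_genes)]
    row[0] += 3.0 * lab
    row[1] -= 3.0 * lab
    X.append(row)
    y.append(lab)

members = fit_ensemble(X, y)
importance = gene_importance(members, n_genes)
train_accuracy = mean(
    1 if (predict_proba(members, x) >= 0.5) == (lab == 1) else 0
    for x, lab in zip(X, y)
)
```

In this sketch a gene's importance is simply how often it is selected across the ensemble, which is one of several possible definitions and is not necessarily the one used by IBEC.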
APPLICATION USEFULNESS
• To provide an alternative non-clinical diagnosis of colon cancer.
• To provide database information for the diagnosis of colon cancer.
• To detect colon cancer at any stage of occurrence.
The data used in this study were obtained from the Princeton microarray repository on colon cancer. They contain expression levels of 2000 genes measured on 62 biological samples, comprising 40 tumorous tissue samples and 22 normal tissue samples. The data are available at http://microarray.princeton.edu/oncology/affydata/index.html.
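As a point of reference for reading any accuracy figure on this data set, the stated class counts (40 tumorous and 22 normal samples out of 62) fix the accuracy of a trivial classifier that always predicts the majority class. The short Python calculation below works this out. It uses only the counts quoted in the text, and the variable names are illustrative.

```python
# Class counts as stated for the colon cancer data set
n_tumor, n_normal = 40, 22
n_total = n_tumor + n_normal

# Accuracy of always predicting the larger class, in percent
baseline_accuracy = 100.0 * max(n_tumor, n_normal) / n_total

print(f"{n_total} samples, majority-class baseline accuracy: {baseline_accuracy:.2f}%")
```

A classifier is informative on these data only to the extent that its accuracy exceeds this baseline of about 64.52%.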
IBEC PERFORMANCE RESULTS
Table 1: Performance of the IBEC algorithm.

Performance metric       IBEC
Accuracy                 96.77%
Sensitivity              100.0%
False positive rate      4.80%
False negative rate      0.00%
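The four metrics reported for IBEC are standard functions of a binary confusion matrix. The following Python sketch shows how they are defined, taking "tumor" as the positive class. The counts passed in at the end are hypothetical and chosen only for illustration. They are not the counts behind Table 1, and the function name `classification_metrics` is an assumption.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return accuracy, sensitivity, false positive rate and false
    negative rate (all in percent) from confusion-matrix counts.

    tp: tumors classified as tumor     fn: tumors classified as normal
    fp: normals classified as tumor    tn: normals classified as normal
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),
        "false_positive_rate": 100.0 * fp / (fp + tn),
        "false_negative_rate": 100.0 * fn / (tp + fn),
    }

# Hypothetical counts for illustration only (not the IBEC results)
metrics = classification_metrics(tp=38, fn=2, fp=1, tn=21)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

By construction, sensitivity and the false negative rate sum to 100%, which matches the pairing of 100.0% and 0.00% in the table.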
COMMERCIALIZATION
• The IBEC algorithm can be used to develop software that can be embedded in a device to diagnose colon cancer.
• The IBEC algorithm can be used to create a database for existing cancer diagnostic devices.
RESULTS
[Figure: Picture of the algorithm code in RStudio]
[Figure: Relative importance of genes in colon cancer. Bar chart; x-axis: selected genes; y-axis: % importance (0 to 100). Bar values: 88.88, 48.59, 41.42, 14.66, 7.12, 6.42]
CONCLUSION
• This work presented the possibility of non-clinical diagnosis of colon cancer from mRNA tissue samples using the gene expression profiles of biological samples.
• A novel algorithm called IBEC, for diagnosing binary tumor classes and selecting informative gene biomarkers for their classification, is presented in this work.
• Results from data calibration show the viability of the algorithm.
PUBLICATION
Olaniran, O. R., et al. (2016). Improved Bayesian Feature Selection and Classification Methods Using Bootstrap Prior Techniques. Anale. Seria Informatică, Vol. XIV, fasc. 2. http://anale-informatica.tibiscus.ro/