System Biology and Machine Learning Framework for Prostate Cancer Survival Prediction


  • Utpala Nanda Chowdhury, Department of Computer Science and Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
  • A. F. M. Mahbubur Rahman, Department of Computer Science and Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
  • Md. Omar Faruqe, Department of Computer Science and Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
  • M. Babul Islam, Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh
  • Shamim Ahmad, Department of Computer Science and Engineering, University of Rajshahi, Rajshahi 6205, Bangladesh


Keywords: Prostate Cancer, Gene expression, RNA-Seq, Survival analysis, Biomarker


Prostate cancer (PC) is the most commonly diagnosed and the second most lethal malignancy in men. A proper understanding of the factors that influence the disease mechanism, the response to treatment and long-term survival could facilitate effective disease management, treatment planning and decision making. Previous studies have reported a number of genes that affect PC development, but their influence on the overall survival of patients remains unclear. In this study, we first identified PC-related signature genes by analysing RNA-seq transcriptomic data. We then investigated the influence of those genes on the survival of PC patients using clinical and transcriptomic data from The Cancer Genome Atlas (TCGA). In both univariate and multivariate analyses using the Cox proportional-hazards (CoxPH) model, we observed a notable difference in survival time between the altered and normal groups for two genes (APLN and DUOXA1). Protein-protein interaction analysis also identified ten hub genes (CAV1, RHOU, TUBB4A, RRAS, EFNB1, ZWINT, MYL9, PPP3CA, FGFR2 and GATA3) that could be potential targets for therapeutic intervention. Moreover, functional enrichment analysis revealed several significant molecular pathways. After verification through functional studies, the identified genetic determinants could serve as therapeutic targets for prolonging the survival of PC patients.
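As an illustration of the univariate CoxPH comparison between altered and normal groups, the following is a minimal sketch and not the authors' pipeline. It fits a one-covariate Cox model by Newton-Raphson on the Breslow partial likelihood, using only the Python standard library. The function name `cox_univariate` and the ten-patient toy data set are hypothetical and chosen only for demonstration; they are not taken from the study or from TCGA.

```python
import math


def cox_univariate(times, events, x, iters=50, tol=1e-9):
    """Fit a Cox model with one covariate (hypothetical helper).

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if the patient was censored
    x      : covariate, e.g. 1 = gene altered, 0 = normal
    Returns (beta, approximate standard error of beta).
    """
    n = len(times)
    beta = 0.0
    info = 0.0
    for _ in range(iters):
        score = 0.0  # first derivative of the log partial likelihood
        info = 0.0   # negative second derivative (observed information)
        for i in range(n):
            if not events[i]:
                continue  # censored patients contribute only to risk sets
            s0 = s1 = s2 = 0.0
            for j in range(n):
                if times[j] >= times[i]:  # patient j is still at risk
                    w = math.exp(beta * x[j])
                    s0 += w
                    s1 += w * x[j]
                    s2 += w * x[j] * x[j]
            mean = s1 / s0
            score += x[i] - mean
            info += s2 / s0 - mean * mean
        if info <= 0.0:
            raise ValueError("covariate does not vary within any risk set")
        # Newton step, clipped so a single update cannot overshoot badly
        step = max(-1.0, min(1.0, score / info))
        beta += step
        if abs(step) < tol:
            break
    return beta, 1.0 / math.sqrt(info)


# Hypothetical toy data: survival time in months, event flag, alteration flag
times = [5, 8, 12, 14, 20, 21, 27, 30, 33, 40]
events = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1]
altered = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

beta, se = cox_univariate(times, events, altered)
hazard_ratio = math.exp(beta)
z = beta / se  # Wald statistic for the null hypothesis beta = 0
print(f"beta={beta:.3f}  HR={hazard_ratio:.3f}  se={se:.3f}  z={z:.3f}")
```

A hazard ratio above 1 indicates a higher estimated risk of death in the altered group. In practice, an established survival analysis package would normally be used instead, and a multivariate model would add clinical covariates alongside gene status.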






How to Cite

Utpala Nanda Chowdhury, A. F. M. Mahbubur Rahman, Md. Omar Faruqe, M. Babul Islam, & Shamim Ahmad. (2022). System Biology and Machine Learning Framework for Prostate Cancer Survival Prediction. International Journal of Computer (IJC), 43(1), 129–138. Retrieved from