Stable Gene Signature Selection for Prediction of Breast Cancer Recurrence Using Joint Mutual Information

Stable Gene Signature Selection for Prediction of Breast Cancer Recurrence Using Joint Mutual Information In this experiment, a gene selection technique was proposed to select a robust gene signature from microarray data for prediction of breast cancer recurrence. In this regard, a hybrid scoring criterion was designed as linear combinations of the scores that were determined in the mutual information (MI) domain and protein-protein interactions (PPI) network. Whereas, the MI-based score represents the complementary information between the selected genes for outcome prediction; and the number of connections in the PPI network between the selected genes builds the PPI-based score. All genes were scored by using the proposed function in a hybrid forward-backward gene-set selection process to select the optimum biomarker-set from the gene expression microarray data. The accuracy and stability of the finally selected biomarkers were evaluated by using five-fold cross-validation to classify available data on breast cancer patients into two cohorts of poor and good prognosis. The results showed an appealing improvement in the cross-dataset accuracy in comparison with similar studies whenever we applied a primary signature, which was selected from one dataset, to predict survival in other independent datasets. Moreover, the proposed method demonstrated 58-92% overlap between 50-genes signatures, which were selected from seven independent datasets individually.