Numerous studies have indicated that long non-coding RNAs (lncRNAs) play critical roles in regulating cellular biological processes by binding to RNA-related proteins. However, only a few experimentally supported lncRNA-protein associations have been reported. Existing network-based methods typically focus on intrinsic features of lncRNAs and proteins but ignore the information implicit in the topologies of biological networks associated with lncRNAs.
To address these limitations, researchers from Central South University propose PLPIHS, a computational method for Predicting LncRNA-Protein Interactions using HeteSim Scores. PLPIHS uses the HeteSim measure to calculate a relatedness score for each lncRNA-protein pair in a heterogeneous network that consists of a lncRNA-lncRNA similarity network, a lncRNA-protein association network and a protein-protein interaction network. The HeteSim scores are then used to build an SVM classifier that predicts lncRNA-protein interactions. The results show that PLPIHS performs significantly better than existing state-of-the-art approaches and achieves an AUC of 0.97 in leave-one-out validation. The researchers also compare performance on networks with different connectivity densities and find that PLPIHS performs well across all of them. Furthermore, they use the method to identify proteins related to the lncRNA MALAT1. Highly ranked proteins are verified by biological studies, which demonstrates the effectiveness of the method.
Flowchart of PLPIHS
The workflow includes three steps: (1) constructing a heterogeneous network consisting of a lncRNA-lncRNA similarity network, a lncRNA-protein association network and a protein-protein interaction network; (2) using the HeteSim measure to calculate a score for each lncRNA-protein pair along each path; (3) combining the scores of the different paths with an SVM classifier and making predictions.
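Step (2) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard HeteSim definition for a path made of two relations: the score is the cosine similarity between the probability distribution reached from the source along the first relation and the distribution reached from the target, walking backwards, along the second. The toy matrices, the function names and the choice of the lncRNA-protein-protein path are all hypothetical and serve only as an example.

```python
import math


def row_normalize(matrix):
    """Scale each row to sum to 1 (transition probabilities); all-zero rows stay zero."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hetesim_two_step(left, right, source, target):
    """HeteSim-style score for a path source -> middle -> target.

    left:  adjacency matrix, source-type nodes x middle-type nodes
    right: adjacency matrix, middle-type nodes x target-type nodes
    """
    reach_from_source = row_normalize(left)[source]
    # Walk backwards from the target: transpose, then row-normalize.
    reach_from_target = row_normalize(transpose(right))[target]
    return cosine(reach_from_source, reach_from_target)


# Hypothetical toy network: 2 lncRNAs and 3 proteins.
lnc_protein = [
    [1, 1, 0],  # lncRNA 0 is associated with proteins 0 and 1
    [0, 0, 1],  # lncRNA 1 is associated with protein 2
]
protein_protein = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

# Score of (lncRNA 0, protein 1) along the path lncRNA -> protein -> protein.
score = hetesim_two_step(lnc_protein, protein_protein, source=0, target=1)
print(round(score, 4))
```

In a pipeline like the one described above, one such score per path would form the feature vector of a lncRNA-protein pair, and those vectors would then be passed to the SVM classifier in step (3). Paths with an odd number of relations need an extra decomposition of the middle relation, which this sketch omits.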