PCM: Prioritizing cancer-related microRNAs by integrating microRNA and mRNA datasets


2016/12/02: The source code and simulated data sets were updated.
In this study, we propose a novel method for prioritizing candidate cancer-related miRNAs that may affect the expression of other miRNAs and genes across the entire biological network. The method uses three features: the average expression of a miRNA across multiple cancer samples, the average absolute correlation between the expression of a miRNA and the expression of all genes, and the number of predicted target genes of the miRNA. These three features are integrated using order statistics.
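The following is a minimal Python sketch of how three such features could be computed and combined. It is not the authors' R implementation. It assumes that a larger value of each feature indicates a stronger candidate, and it uses one common order-statistics formulation (the joint cumulative probability of the sorted rank ratios, often called a Q statistic), which may differ from the exact procedure in the paper. All function names, miRNA names, and gene names are hypothetical.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes neither is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_ratios(scores):
    """Rank ratio in (0, 1]; the highest score gets the smallest ratio (assumption)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ratios = [0.0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ratios[idx] = rank / len(scores)
    return ratios


def q_statistic(ratios):
    """Joint probability that N uniform order statistics fall below the sorted ratios."""
    r = sorted(ratios)
    n = len(r)
    v = [1.0]  # V_0 = 1
    for k in range(1, n + 1):
        # V_k = sum_{i=1..k} (-1)^(i-1) * V_{k-i} / i! * r_{N-k+1}^i
        v.append(sum((-1) ** (i - 1) * v[k - i] / math.factorial(i) * r[n - k] ** i
                     for i in range(1, k + 1)))
    return math.factorial(n) * v[n]


def prioritize(mirna_expr, mrna_expr, n_targets):
    """Return (miRNA, Q) pairs sorted so the strongest candidates come first."""
    names = list(mirna_expr)
    # Feature 1: average expression across samples
    mean_expr = [mean(mirna_expr[m]) for m in names]
    # Feature 2: average absolute correlation with every gene
    mean_abs_corr = [mean([abs(pearson(mirna_expr[m], g)) for g in mrna_expr.values()])
                     for m in names]
    # Feature 3: number of predicted target genes
    targets = [n_targets[m] for m in names]
    per_feature = [rank_ratios(f) for f in (mean_expr, mean_abs_corr, targets)]
    q = {m: q_statistic([feat[i] for feat in per_feature]) for i, m in enumerate(names)}
    return sorted(q.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    # Toy data only; rows are expression profiles over the same five samples.
    mirna_expr = {"miR-A": [11, 12, 13, 14, 15], "miR-B": [1, 3, 2, 3, 1], "miR-C": [2, 1, 2, 1, 2]}
    mrna_expr = {"gene1": [1, 2, 3, 4, 5], "gene2": [9, 8, 7, 6, 5], "gene3": [2, 4, 6, 8, 10]}
    n_targets = {"miR-A": 50, "miR-B": 10, "miR-C": 20}
    for name, q in prioritize(mirna_expr, mrna_expr, n_targets):
        print(name, round(q, 4))
```

A smaller Q means the miRNA ranks consistently high across the three features, so candidates are listed in ascending order of Q. Ties within a feature are broken by input order in this sketch.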
Paper: Daeyong Jin and Hyunju Lee. "Prioritizing cancer-related microRNAs by integrating microRNA and mRNA datasets." Scientific Reports 6, Article number: 35350 (2016).
Source code
The source code, implemented in R, and the simulated input data are available here. Instructions for running the source code are here. Note that the attached input and output files are not real data, but they have the same format as the real data. The input files required for each step are described in the instructions.
Data sets

The glioblastoma (GBM), ovarian cancer (OVC), prostate cancer (PRCA), and breast cancer (BRCA) data sets used in this paper, including mRNA and miRNA expression data, were collected from TCGA. Information about cancer-related miRNAs was collected from the Human miRNA & Disease Database v2.0 (HMDD v2.0). Predicted gene-miRNA interaction pairs were collected from PicTar, TargetScan, and MicroCosm.

daeyongjin at gist.ac.kr, hyunjulee at gist.ac.kr