CGMMC: A Computational Approach to Identifying Gene-microRNA Modules in Cancer


2016/12/26: Source code and data sets have been updated.
In this study, we propose a computational approach to constructing modules that represent the relationships between genes and miRNAs in cancer by integrating gene and miRNA expression data with gene-gene interaction data. First, we used a biclustering algorithm to construct modules, each consisting of a subset of genes and a subset of samples, to account for the heterogeneity of cancer cells. Second, we incorporated gene-gene interactions to include genes that play important roles in cancer-related pathways. Finally, we selected miRNAs that are closely associated with the genes in each module using a Gaussian Bayesian network and the Bayesian information criterion (BIC).
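The miRNA selection step can be illustrated with a minimal sketch. This is not the authors' R implementation. It assumes a simplified scoring scheme in which each miRNA is tested on its own as a single parent of every gene in a module, and it is kept only if the summed BIC of the linear Gaussian models decreases. The function names gaussian_bic and select_mirnas and the array layout (samples in rows) are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_bic(y, X):
    """BIC of a linear Gaussian model y ~ X, where X already holds an intercept column."""
    n = y.shape[0]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Maximum-likelihood variance estimate, floored to avoid log(0) on a perfect fit.
    sigma2 = max(float(resid @ resid) / n, 1e-12)
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = X.shape[1] + 1  # regression coefficients plus the variance
    return -2 * log_lik + k * np.log(n)


def select_mirnas(module_genes, mirnas):
    """Return indices of miRNAs that lower the summed BIC over the module's genes.

    module_genes: (n_samples, n_genes) expression of the genes in one module
    mirnas:       (n_samples, n_mirnas) miRNA expression for the same samples
    Assumption: each miRNA is scored alone, as the only parent of every gene.
    """
    n, n_genes = module_genes.shape
    intercept = np.ones((n, 1))
    # Baseline: every gene modelled with no miRNA parent.
    base = sum(gaussian_bic(module_genes[:, g], intercept) for g in range(n_genes))
    selected = []
    for m in range(mirnas.shape[1]):
        X = np.hstack([intercept, mirnas[:, [m]]])
        score = sum(gaussian_bic(module_genes[:, g], X) for g in range(n_genes))
        if score < base:  # lower BIC means the miRNA improves the model
            selected.append(m)
    return selected
```

Calling select_mirnas on the gene and miRNA expression matrices restricted to one module's samples returns the column indices of the miRNAs that pass this test. A fuller treatment would search over sets of parents rather than one miRNA at a time.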
Paper
Daeyong Jin and Hyunju Lee. "A computational approach to identifying gene-microRNA modules in cancer." PLoS Comput Biol 11.1 (2015): e1004042.
Source codes
Source code implemented in R and simulated example data sets (with results) are available here. Instructions for running the source code are here. Note that the attached example input and output files are not real data but have the same format as the real data. The input files required for each step are described in the instructions.
Data sets
The glioblastoma (GBM) and ovarian cancer (OVC) data sets used in this paper, including gene and miRNA expression data, were collected from TCGA. Information about GBM- and OVC-related miRNAs was collected from the Human miRNA & Disease Database v2.0 (HMDD v2.0). Protein-protein interaction data sets were collected from the Human Protein Reference Database (HPRD) (Prasad et al., 2009).
daeyong at, hyunjulee at