A multifactorial integrative analysis scheme for combined mRNA and microRNA expression datasets

Authored by A Sewer, V Belcastro, A Hengstermann*, C Mathis, J Hoeng

Presented at RECOMB-DREAM 2011 - Annual Dream on Reverse Engineering Challenges     
* This author is not affiliated with PMI.


Introduction

Extracting the relevant signal from high-throughput ‘omics’ datasets is a challenging task, largely because of the high noise level arising from both biological and technical sources. Strategies that combine data from different sources have proved powerful, improving the quality of the results and allowing more successful interpretation. Here, we consider the case of combined microRNA and gene expression data from a multi-factorial in vitro experiment.

Objective

Our objective was to develop methodologies to extract exploitable information from microRNA expression data: first, to characterize the microRNA response to the experimental treatment, and then to use complementary data sources, such as microRNA target predictions and gene expression data, to integrate the relevant microRNAs into regulatory interaction networks associated with the biological processes of interest.

Material and Methods

To characterize the smoke-induced perturbations in this in vitro system, organotypic bronchial epithelial cell cultures derived from normal human tissue were exposed directly to cigarette smoke (CS) using several exposure and post-exposure times.

Expression data. Exiqon miRCURY LNA™ technology was used to profile the microRNAs, and the Affymetrix HG-U133 Plus 2.0 platform was used to generate the gene expression data.

MicroRNA target predictions. TargetScan results were used to provide a list of candidate microRNA-gene regulatory interactions, which were then filtered based on the results of the combined expression analysis.

Computational methods. Differential expression was determined using linear models based on the experimental design.
Combined analysis was based on the Pearson correlations between the microRNA and gene expression matrices, both previously transformed using the same linear model as for the differential expression calculation.

Results

Principal component analysis of the microRNA expression matrix indicated that CS-induced differential expression accounted for only a small fraction (8%) of the total variance. The calculation of the corresponding differential expression revealed a subset of ~30 microRNAs that responded significantly to cigarette smoke exposure. The results of the integrative analyses indicated that these microRNAs are associated mainly with proliferation and inflammation processes and suggested further functional activities that have not yet been experimentally confirmed.

Conclusions

The choice of computational method was decisive in filtering the results and thereby ensuring their reliability. In particular, computing the microRNA differential expression required a carefully chosen linear model to produce a satisfactory outcome. Despite the incomplete knowledge of microRNA functions, the integration of several data sources enabled the identification of several relevant biological processes.
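
Illustrative sketch

The combined analysis described under Computational methods can be illustrated with a minimal Python/NumPy sketch. It is not the authors' implementation. It assumes that "transformed using the same linear model" means replacing each expression profile by its least-squares fitted values under the design matrix; residuals could be substituted if that is the intended transformation. The function names, the toy design (intercept, treatment, time), the random data, the negative-correlation filter and the -0.5 threshold are all hypothetical.

```python
import numpy as np

def fit_linear_model(expr, design):
    """Fitted values of each row of expr (features x samples)
    under the linear model given by design (samples x coefficients)."""
    coef, *_ = np.linalg.lstsq(design, expr.T, rcond=None)
    return (design @ coef).T

def cross_pearson(a, b):
    """Pearson correlation between every row of a and every row of b.
    Rows with zero variance yield NaN."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Toy data standing in for the real matrices (assumption: 12 samples).
rng = np.random.default_rng(0)
n_samples = 12
treatment = np.repeat([0.0, 1.0], 6)     # hypothetical sham vs. CS
time = np.tile([0.0, 1.0, 2.0], 4)       # hypothetical post-exposure time
design = np.column_stack([np.ones(n_samples), treatment, time])

mir = rng.normal(size=(5, n_samples))    # microRNAs x samples
gene = rng.normal(size=(8, n_samples))   # genes x samples

# Transform both matrices with the same linear model, then correlate.
corr = cross_pearson(fit_linear_model(mir, design),
                     fit_linear_model(gene, design))

# Placeholder for predicted microRNA-gene pairs (e.g. from target predictions);
# here every pair is treated as a candidate.
candidates = np.ones_like(corr, dtype=bool)

# Hypothetical filter: keep candidate pairs with strong negative correlation.
selected = np.argwhere(candidates & (corr < -0.5))
```

Each row of `selected` holds a (microRNA index, gene index) pair that passes the filter. In a real analysis, the boolean `candidates` matrix would encode the predicted interactions rather than all pairs.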