Peer-Reviewed Publications

      An algorithm for score aggregation over causal biological networks based on random walk sampling

      Vasilyev, D. M.; Thomson, T. M.; Frushour, B. P.; Martin, F.; Sewer, A.
      Published
      Aug 11, 2014
      DOI
      10.1186/1756-0500-7-516
      PMID
      25113603
      Topic
      Summary

      Background: We recently published in BMC Systems Biology an approach for calculating the perturbation amplitudes of causal network models by integrating gene differential expression data. This approach relies on the process of score aggregation, which combines the perturbations at the level of the individual network nodes into a global measure that quantifies the perturbation of the network as a whole. Such "bottom-up" aggregation relates the changes in molecular entities measured by omics technologies to systems-level phenotypes. However, the aggregation method we used is limited to a specific class of causal network models called "causally consistent", which is equivalent to the notion of balance of a signed graph used in graph theory. As a consequence of this limitation, our aggregation method cannot be used in the many relevant cases involving "causally inconsistent" network models such as those containing negative feedbacks. Findings: In this note, we propose an algorithm called "sampling of spanning trees" (SST) that extends our published aggregation method to causally inconsistent network models by replacing the signed relationships between the network nodes by an appropriate continuous measure. The SST algorithm is based on spanning trees, which are a particular class of subgraphs used in graph theory, and on a sampling procedure leveraging the properties of specific random walks on the graph. This algorithm is applied to several cases of biological interest. Conclusion: The SST algorithm provides a practical means of aggregating nodal values over causally inconsistent network models based on solid mathematical foundations. We showed its utility in systems biology, where the nodal values can be perturbation amplitudes of protein activities or gene differential expressions, while the networks can be models of cellular signaling or expression regulation. Since the SST algorithm is based on general graph-theoretical considerations, it is scalable to arbitrary graph sizes and can potentially be used for performing quantitative analyses in any context involving signed graphs.