Global assessment of network inference algorithms based on available literature of gene/protein interactions
We propose a framework that uses the gene/protein interaction databases available in the literature as a universal benchmark for globally assessing the performance of gene network inference algorithms. We also developed an R software package for convenient use of the framework; the package can also serve more generally as a quick tool for searching the literature for available validations of interactions. We applied the proposed approach to 2 publicly available prostate cancer gene expression datasets and a large breast cancer gene expression dataset. The results revealed differing characteristics, and in some respects the superiority, of algorithms that had not previously been compared in the literature. Our approach allowed the algorithms to be assessed and compared on a real dataset of around 30,000 probes, which showed their strengths and weaknesses from points of view that conventional approaches do not offer. We further show that our approach provides a unique advantage in assessing the performance of an inference method when it is applied to a new dataset, and thus sheds light on the results of a de novo application, which would otherwise remain obscure.
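As a rough illustration of the kind of literature-based assessment described above, the following Python sketch scores a set of inferred interactions against a set of interactions drawn from the literature. It is not the authors' R package. The function names (`normalize`, `assess`), the toy gene symbols, and the use of precision together with a hypergeometric tail probability are all assumptions made for this example; the abstract does not state which statistics the framework reports.

```python
from fractions import Fraction
from math import comb


def normalize(edges):
    """Treat interactions as undirected and drop self-loops."""
    return {frozenset(e) for e in edges if e[0] != e[1]}


def assess(inferred, literature, genes):
    """Score an inferred network against a literature-derived edge set.

    Hypothetical helper: only edges whose two genes are both in `genes`
    (the measured gene universe) are considered on either side.
    """
    genes = set(genes)
    inferred = {e for e in normalize(inferred) if e <= genes}
    literature = {e for e in normalize(literature) if e <= genes}

    total_pairs = comb(len(genes), 2)   # all possible undirected edges
    hits = inferred & literature        # inferred edges found in the literature
    n, K, k = len(inferred), len(literature), len(hits)

    # Hypergeometric upper tail P(X >= k): chance of at least k literature
    # edges when n edges are drawn at random from all possible pairs.
    tail = sum(
        comb(K, i) * comb(total_pairs - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    p_value = float(Fraction(tail, comb(total_pairs, n)))
    precision = k / n if n else 0.0
    return {"validated": hits, "precision": precision, "p_value": p_value}


# Toy data (made up for illustration, not taken from any database).
genes = ["A", "B", "C", "D"]
literature = [("A", "B"), ("B", "C")]
inferred = [("B", "A"), ("C", "D")]

result = assess(inferred, literature, genes)
print(result["precision"], result["p_value"])
```

Exact integer arithmetic keeps the sketch dependency-free, but at the scale of tens of thousands of probes a numerical routine such as `scipy.stats.hypergeom` would likely be more practical. Note also that literature databases are incomplete, so an inferred edge absent from them is unvalidated rather than necessarily wrong.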