- Open Access
Construction and analysis of protein–protein interaction networks
© Raman; licensee BioMed Central Ltd. 2010
- Received: 25 November 2009
- Accepted: 15 February 2010
- Published: 15 February 2010
Protein–protein interactions form the basis for a vast majority of cellular events, including signal transduction and transcriptional regulation. It is now understood that the study of interactions between cellular macromolecules is fundamental to the understanding of biological systems. Interactions between proteins have been studied through a number of high-throughput experiments and have also been predicted through an array of computational methods that leverage the vast amount of sequence data generated in the last decade. In this review, I discuss some of the important computational methods for the prediction of functional linkages between proteins. I then give a brief overview of some of the databases and tools that are useful for a study of protein–protein interactions. I also present an introduction to network theory, followed by a discussion of the parameters commonly used in analysing networks, important network topologies, as well as methods to identify important network components, based on perturbations.
- Cluster Coefficient
- Betweenness Centrality
- Functional Linkage
- Phylogenetic Profile
- Characteristic Path Length
Proteins are the main catalysts, structural elements, signalling messengers and molecular machines of biological tissues. Protein–protein interactions (PPIs) are extremely important in orchestrating the events in a cell. They form the basis for several signal transduction pathways in a cell, as well as various transcriptional regulatory networks. The availability of complete and annotated genome sequences of several organisms has led to a paradigm shift from the study of individual proteins in an organism to large-scale proteome-wide studies of proteins, which interact in a beautifully concerted network of metabolic, signalling and regulatory pathways in a cell. In general, the behaviour of a system is quite different from merely the sum of the interactions of its various parts. As Anderson put it as early as 1972, in his classic paper of the same title, "More is different": it is not possible to reliably predict the behaviour of a complex system, despite a good knowledge of the fundamental laws governing the individual components. Comparative genomics at a primary sequence level has also indicated that species differences are due more to differences in the interactions between the component proteins than to the individual genes themselves. Consequently, several efforts have been made to identify these interactions, in an attempt to understand biological systems better [4–12]. The need to understand protein structure and function has been a critical driving force for biological research in recent decades. With the advent of high-throughput experiments to identify PPIs, more knowledge on protein function has been obtained, together with the development of several methods to predict and study the interactions between proteins.
A wide variety of methods have been used to identify protein–protein associations; these associations may range from direct physical interactions inferred from experimental methods to functional linkages predicted on the basis of computational analyses. In the past, experimental methods based on microarrays and yeast two-hybrid, as well as computational methods based on protein sequences and structures have been developed and widely used. Given the difficulties in experimentally identifying PPIs, a wide range of computational methods have been used to identify protein–protein functional linkages and interactions. These methods range from identifying a single pair of interacting proteins at one end, to the identification and analysis of a large network of thousands of proteins, the latter as large as that of an entire proteome of a given cell.
Databases and resources useful for researching PPIs.
Peer-reviewed bio-molecular interaction database containing published interactions and complexes
Protein and genetic interactions from major model organism species
Orthology data and phylogenetic profiles
Experimentally determined interactions between proteins
Human protein functions, PPIs, post-translational modifications, enzyme–substrate relationships and disease associations
Interaction data abstracted from literature or from direct data depositions by expert curators
Physical interactions between those Pfam domains that have a representative structure in the Protein DataBank (PDB)
Experimentally verified PPI mined from the scientific literature by expert curators
Experimentally derived and computationally predicted functional linkages
Protein functional linkages
Domain–domain interactions and their interfaces derived from PDB structure files and SCOP domain definitions
Protein functional linkages from experimental data and computational predictions
STRING (Search Tool for the Retrieval of Interacting Genes/Proteins; http://string.embl.de/) [36, 48] is a pre-computed database for the exploration and analysis of protein–protein associations. The associations are derived from high-throughput experimental data, mining of databases and literature, analyses of co-expressed genes and also from computational predictions, including those based on genomic context analysis. STRING employs a unique scoring framework based on benchmarks of the different types of associations against a common reference set, to produce a single confidence score per prediction. The graphical user interface is appealing and user-friendly, backed by an excellent visualisation engine. Medusa (http://coot.embl.de/medusa/), a general graph visualisation tool, is a front end (interface) to the STRING protein interaction database.
Human Protein Reference Database (HPRD; http://www.hprd.org/) integrates information relevant to the function of human proteins in health and disease. The database is almost completely manually curated by biologists who have read and interpreted over 300,000 published articles during the annotation process. Data pertaining to thousands of PPIs, post-translational modifications, enzyme–substrate relationships, disease associations, tissue expression and sub-cellular localisation have been extracted from literature into the database.
The DIP (Database of Interacting Proteins; http://dip.doe-mbi.ucla.edu/) database catalogues experimentally derived PPIs. Because the underlying experiments vary in type and reliability, DIP applies quality assessment methods to pick out subsets of the most reliable interactions. DIP is generally considered a valuable benchmark against which to verify the performance of any new method for the prediction of PPIs.
The Predictome database houses links between the proteins of 44 genomes based on the implementation of gene context functional linkage methods, viz. chromosomal proximity, phylogenetic profiling and domain fusion. It also contains information on large-scale experimental screenings of PPI data, from experiments such as yeast two-hybrid, immuno-co-precipitation and correlated expression. The Predictome database is presently accessible through the visual front-end provided by VisANT (http://visant.bu.edu/), which is a versatile tool for visualisation and analysis of interaction data.
Tools for network analysis and visualisation
Examples of tools useful for the visualisation of networks and PPIs.
BioLayout Express 3D
Facilitates microarray data analysis
Versatile; implements many visualisation algorithms; many plug-ins available
Large Graph Layout (LGL)
Especially useful for dynamic visualisation of large graphs (10⁵ nodes, 10⁶ edges); force-directed layout algorithm
Provides network filters, connectivity filters, many layouts and facilitates dataset superimposing
Especially useful for the analysis of very large networks
Especially facilitates analysis of gene ontologies
General purpose graph editor
Cytoscape (http://www.cytoscape.org/) is a software platform for visualising molecular interaction networks and integrating these interactions with gene expression profiles. The tool is best used in conjunction with large databases of gene expression data, protein–protein, protein–DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape supports several algorithms for the layout of networks. Several useful plug-ins are available to extend its capabilities. A notable example is the NetworkAnalyzer plug-in, which can be used to compute various network parameters.
Pajek (http://pajek.imfm.si/) is a program (only for Windows-based operating systems) for the analysis and visualisation of very large networks; it can even handle networks with more than 10⁵ nodes. Pajek also includes a variety of network layout algorithms, including force-directed layout algorithms such as Fruchterman–Reingold. Pajek is highly versatile and can also be used to study network dynamics.
The field of network theory has witnessed a number of advances in the past [58–60], many of which are impacting the analyses of biological networks such as PPI networks. In this section, I discuss some of the important network parameters useful in the analysis of networks and understanding their characteristics, important network topologies, as well as some of the measures that can be used to analyse perturbations to networks. Detailed reviews of the application of network theory to biology have been published elsewhere [61, 62].
Network theory provides a quantifiable description of networks; there are several network measures that enable the comparison and characterisation of complex networks:
Connectivity (or) Degree
The most elementary characteristic of a node is its degree, k, which represents the number of links the node has, to other nodes in the network.
The degree distribution, P(k), gives the probability that a selected node has exactly k links. P(k) is obtained by counting the number of nodes N(k) with k = 1, 2, ... links and dividing by the total number of nodes N. The degree distribution allows one to distinguish between various network topologies.
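The definition P(k) = N(k)/N can be sketched in a few lines of Python. This is a minimal illustration only: the adjacency-dictionary representation and the toy network (proteins "A" to "D") are assumptions made for the example, not data from any of the databases discussed here.

```python
from collections import Counter


def degree_distribution(adjacency):
    """Return {k: P(k)} for an undirected graph given as {node: set_of_neighbours}."""
    n_nodes = len(adjacency)
    # N(k): number of nodes that have exactly k links
    counts = Counter(len(neighbours) for neighbours in adjacency.values())
    # P(k) = N(k) / N
    return {k: n_k / n_nodes for k, n_k in sorted(counts.items())}


# Hypothetical toy network in which "A" acts as a small hub
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

print(degree_distribution(toy_network))  # {1: 0.25, 2: 0.5, 3: 0.25}
```

For real interaction data, the same dictionary could be built from an edge list exported from one of the databases described earlier.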
The average clustering coefficient characterises the overall tendency of nodes to form clusters or groups. C(k) is defined as the average clustering coefficient for all nodes with k links.
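The following sketch computes the average clustering coefficient and C(k) for a small graph. It assumes the commonly used per-node definition C_i = 2n_i / (k_i(k_i - 1)), where n_i is the number of links among the k_i neighbours of node i; the text above does not spell this definition out. Assigning a coefficient of zero to nodes with fewer than two neighbours is likewise a convention chosen for the example, and the toy network is hypothetical.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Clustering coefficient of one node: 2 * n_i / (k_i * (k_i - 1))."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention assumed here: undefined cases count as zero
    # n_i: number of links that connect the neighbours of the node to each other
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adjacency):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)


def clustering_by_degree(adjacency):
    """C(k): average clustering coefficient of all nodes with exactly k links."""
    groups = {}
    for node, neighbours in adjacency.items():
        groups.setdefault(len(neighbours), []).append(local_clustering(adjacency, node))
    return {k: sum(values) / len(values) for k, values in sorted(groups.items())}


# Hypothetical toy network
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

print(average_clustering(toy_network))
print(clustering_by_degree(toy_network))
```

In this toy network the two neighbours of "B" are themselves linked, so "B" has a coefficient of 1, whereas only one of the three possible links exists among the neighbours of "A".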
Characteristic Path Length
The characteristic path length, L, is defined as the number of edges in the shortest path between two vertices, averaged over all pairs of vertices. It measures the typical separation between two vertices in the network. Intuitively, it represents the network's overall navigability.
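The network diameter, d, is the greatest shortest-path distance between any two vertices in the network:

$$d = \max_{u,v} d_G(u, v)$$

where $d_G(u, v)$ is the length of the shortest path between u and v in G. A few authors have also used this term to denote the average geodesic distance in a network (which translates to the characteristic path length), although strictly the two measures are distinct.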
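Both the average shortest-path distance (the characteristic path length) and the greatest shortest-path distance can be obtained with a breadth-first search from every node. The sketch below assumes an unweighted, undirected graph stored as an adjacency dictionary; skipping unreachable pairs is a simplification chosen for the example, and the four-node chain is hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_and_diameter(adjacency):
    """Return (mean shortest-path length, greatest shortest-path length)."""
    total, pairs, diameter = 0, 0, 0
    for s in adjacency:
        for t, d in bfs_distances(adjacency, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    # Each pair is visited in both directions, which leaves the mean unchanged.
    mean_length = total / pairs if pairs else 0.0
    return mean_length, diameter


# Hypothetical linear chain A - B - C - D
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(path_length_and_diameter(chain))
```

For the chain, the six pairwise distances are 1, 2, 3, 1, 2 and 1, so the mean is 10/6 while the greatest distance is 3; the two quantities are clearly not interchangeable.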
The betweenness centrality of a vertex v is defined as:

$$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where $\sigma_{st}$ is the number of shortest paths from s to t, and $\sigma_{st}(v)$ is the number of shortest paths from s to t that pass through a vertex v. A similar definition for 'edge betweenness' was given by Girvan and Newman. Nodes with a higher betweenness lie on a larger number of shortest paths in a network.
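As an illustration, betweenness can be computed for an unweighted, undirected graph with Brandes' accumulation scheme, which is one standard way of evaluating the sum over shortest paths; the text does not prescribe a particular algorithm. Counting each unordered pair (s, t) once, by halving the accumulated values, is a convention assumed for this sketch, and the example chain is hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalised betweenness of every node (Brandes-style accumulation)."""
    centrality = {v: 0.0 for v in adjacency}
    for s in adjacency:
        # Breadth-first search from s, counting shortest paths (sigma)
        order = []
        predecessors = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Back-propagate pair dependencies, farthest nodes first
        delta = {v: 0.0 for v in adjacency}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                centrality[w] += delta[w]
    # Undirected graph: every pair was counted from both ends
    return {v: c / 2.0 for v, c in centrality.items()}


# Hypothetical linear chain A - B - C - D
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(betweenness_centrality(chain))
```

In the chain, "B" lies on the shortest paths A to C and A to D, and "C" on B to D and A to D, so both interior nodes score 2 while the end nodes score 0.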
The understanding of the topology or the architectural principles of a biological network can directly give an insight into various network characteristics. There are several known topologies of networks, characterised by their distinctive network parameters. The following are some network models that are relevant to the understanding of biological networks.
The Erdős–Rényi model of a random network starts with N nodes and connects each pair of nodes with a probability p, which creates a graph with approximately pN(N - 1)/2 randomly placed links. The node degrees follow a Poisson distribution, indicating that most nodes have approximately the same number of links. The characteristic path length is proportional to the logarithm of the network size, L ~ log N. C(k) is independent of k.
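A random network of this kind is straightforward to generate, which makes it easy to check the expected number of links, pN(N - 1)/2, numerically. The sketch below is illustrative only; the values N = 200 and p = 0.05 and the fixed seed are arbitrary choices for the example.

```python
import random
from itertools import combinations


def erdos_renyi(n_nodes, p, seed=None):
    """Connect each of the N(N - 1)/2 node pairs independently with probability p."""
    rng = random.Random(seed)
    adjacency = {i: set() for i in range(n_nodes)}
    for u, v in combinations(range(n_nodes), 2):
        if rng.random() < p:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


n, p = 200, 0.05
graph = erdos_renyi(n, p, seed=1)
n_edges = sum(len(neighbours) for neighbours in graph.values()) // 2
expected_edges = p * n * (n - 1) / 2  # 995 for these parameters

print("edges generated:", n_edges, "expected on average:", expected_edges)
print("mean degree:", 2 * n_edges / n)
```

The generated edge count fluctuates around the expectation from run to run, and most node degrees fall close to the mean degree p(N - 1), in line with the Poisson-like degree distribution described above.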
Small-world networks are characterised by two properties: (i) individual nodes have few neighbours, but (ii) most nodes can be reached from one another through few steps, often referred to as 'six degrees of separation'. Small-world networks have been generated by re-wiring regular ring-lattice-like networks. A regular ring-lattice resembles a (circular) string of beads, where each node (bead) is linked to one node on either side, and is additionally connected to the immediate neighbours of those nodes. Thus, each node is linked to the four nodes nearest to it on the 'string'. The ring-lattice is rewired as follows: each original link in the lattice is replaced by a random one with probability ϕ (0 ≤ ϕ ≤ 1), introducing varying amounts of disorder, which takes the network from complete regularity to complete disorder (randomness). The re-wiring process allows the small-world model to interpolate between a regular lattice and a (more or less) random graph. In the lattice expressions that follow, L denotes the number of vertices in the lattice and k the number of neighbours to which each vertex is linked on either side; these are distinct from the characteristic path length and the degree defined earlier. When ϕ = 0, there is no re-wiring and the regular lattice remains unchanged. The clustering coefficient for this lattice tends to 0.75 for large k. The regular lattice, however, does not show the small-world effect: mean geodesic distances between vertices tend to L/4k for large L. When ϕ = 1, every edge is re-wired to a new random location and the graph is almost a random graph, with typical geodesic distances on the order of log L/log k, but very low clustering, C ≃ 2k/L. As Watts and Strogatz showed by numerical simulation, however, there exists a sizeable region in between these two extremes of ϕ, for which the model generates a network that has both low path lengths and high clustering. Small-world networks have a characteristic path length of the same order as random networks (L ≳ log N, where L is again the characteristic path length and N the number of nodes), but have a clustering coefficient much higher than that of random networks (C ≫ Crandom).
The small-world topology has been observed in networks such as film actor networks, power grids and the neural network of the nematode Caenorhabditis elegans.
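The re-wiring procedure described above can be sketched as follows. This is an illustrative variant under stated assumptions rather than a definitive implementation: each original link keeps one endpoint and moves the other to a uniformly chosen node, self-links and duplicate links are avoided, and the lattice is assumed to have more than twice as many nodes as `k_range`. The function and parameter names are hypothetical.

```python
import random


def ring_lattice(n_nodes, k_range=2):
    """Ring in which each node is linked to its k_range nearest nodes on either side."""
    adjacency = {i: set() for i in range(n_nodes)}
    for i in range(n_nodes):
        for step in range(1, k_range + 1):
            j = (i + step) % n_nodes
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def rewire(adjacency, phi, seed=None):
    """Replace each original link by a random one with probability phi."""
    rng = random.Random(seed)
    graph = {u: set(neighbours) for u, neighbours in adjacency.items()}
    original_edges = [(u, v) for u in adjacency for v in adjacency[u] if u < v]
    for u, v in original_edges:
        if rng.random() < phi:
            # Candidate new endpoints: not u itself and not already linked to u
            candidates = [w for w in graph if w != u and w not in graph[u]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            graph[u].discard(v)
            graph[v].discard(u)
            graph[u].add(w)
            graph[w].add(u)
    return graph


lattice = ring_lattice(20, k_range=2)       # every node has four neighbours
partly_random = rewire(lattice, phi=0.1, seed=42)
fully_random = rewire(lattice, phi=1.0, seed=42)
```

Because every re-wired link is replaced rather than deleted, the number of links stays constant across all values of ϕ, so changes in path length and clustering reflect the re-arrangement of links alone.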
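Scale-free networks can be generated by the Barabási–Albert model, which is based on growth and preferential attachment. The network starts with m0 nodes; at each iteration, a new node with m links is added, and each of these links connects to an existing node i with a probability proportional to the degree of that node:

$$\Pi(k_i) = \frac{k_i}{\sum_{j \in G} k_j}$$

where $k_i$ is the degree of node i and the denominator represents the sum of the degrees of all nodes in the network (G). After n iterations, the model leads to a network with m0 + n nodes and mn edges. The network generated by this model has a power-law degree distribution characterised by γ = 3. Scale-free networks with 2 < γ < 3, a range commonly observed in many biological networks, are ultra-small, with a characteristic path length L ~ log log N, significantly smaller than that of random networks (log N).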
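Growth by preferential attachment, in which an existing node i is chosen with probability k_i divided by the sum of all degrees, can be sketched with a pool in which every node appears once per link it holds. Two details are assumptions made for this example: the m0 starting nodes carry no links, so the first new node picks its targets uniformly at random, and the m targets of each new node are required to be distinct. The function name is hypothetical.

```python
import random


def preferential_attachment(m0, m, n_iterations, seed=None):
    """Grow a network from m0 isolated nodes, adding one node with m links per step."""
    if not 1 <= m <= m0:
        raise ValueError("m must lie between 1 and m0")
    rng = random.Random(seed)
    adjacency = {i: set() for i in range(m0)}
    degree_pool = []  # node i appears k_i times, so a uniform draw is degree-weighted
    for step in range(n_iterations):
        new_node = m0 + step
        if degree_pool:
            targets = set()
            while len(targets) < m:
                targets.add(rng.choice(degree_pool))
        else:
            # All degrees are zero at the start: choose uniformly (assumption)
            targets = set(rng.sample(range(m0), m))
        adjacency[new_node] = set()
        for t in targets:
            adjacency[new_node].add(t)
            adjacency[t].add(new_node)
            degree_pool.extend([new_node, t])
    return adjacency


network = preferential_attachment(m0=3, m=2, n_iterations=100, seed=7)
n_edges = sum(len(neighbours) for neighbours in network.values()) // 2
print(len(network), "nodes,", n_edges, "edges")
print("largest degree:", max(len(neighbours) for neighbours in network.values()))
```

With m0 = 3, m = 2 and 100 iterations, the sketch yields m0 + n = 103 nodes and mn = 200 edges, matching the counts stated above; early nodes tend to accumulate far more links than late ones.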
Analysis of network perturbations
Networks can be perturbed through the removal of nodes and edges. A typical analysis would be to probe the effect of disrupting a node and its corresponding edges. Networks of different topologies vary in their resilience to various types of perturbations. A number of studies have been carried out to analyse the response of networks to the deletion of their nodes and edges. A review of how nodes in a network can be prioritised based on network analysis has been presented elsewhere.
Barabási and co-workers have analysed the response of scale-free and random networks to various types of 'attacks'. In particular, they have analysed the networks representing the topologies of the Internet and the World-Wide Web. The common observation is that scale-free networks are quite insensitive to random node removals; they are highly robust in the face of random node failures and the characteristic path length was found to be almost unaffected. This is intuitively reasonable, since most of the vertices in these networks have low degree and therefore lie on few paths between others; thus their removal rarely affects communications substantially. On the other hand, directed attacks targeting the highly connected hubs led to a rapid disruption of the communication through the network. The characteristic path length was found to increase very sharply with the fraction of hubs removed and typically only a small fraction of the hubs needed to be 'knocked out' before essentially all communication through the network was destroyed [67, 69].
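The contrast between random failures and targeted attacks can be explored with a small simulation. The sketch below is not the analysis cited above: it uses a hypothetical synthetic network grown by preferential attachment, and it measures damage by the fraction of nodes left in the largest connected component rather than by the characteristic path length, since the latter is awkward to compare once a network fragments.

```python
import random


def scale_free_graph(n_nodes, m=2, seed=0):
    """Synthetic test network: fully connected seed of m + 1 nodes, then growth."""
    rng = random.Random(seed)
    adjacency = {i: set() for i in range(m + 1)}
    pool = []  # each node appears once per link it holds
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adjacency[u].add(v)
            adjacency[v].add(u)
            pool += [u, v]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(pool))
        adjacency[new] = set(targets)
        for t in targets:
            adjacency[t].add(new)
            pool += [new, t]
    return adjacency


def largest_component_fraction(adjacency, removed):
    """Size of the largest surviving component, relative to the original network."""
    alive = set(adjacency) - removed
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adjacency[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / len(adjacency)


def attack(adjacency, fraction, targeted, seed=0):
    """Remove a fraction of nodes, either highest degree first or at random."""
    n_remove = int(fraction * len(adjacency))
    if targeted:
        order = sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=True)
    else:
        order = list(adjacency)
        random.Random(seed).shuffle(order)
    return largest_component_fraction(adjacency, set(order[:n_remove]))


graph = scale_free_graph(500)
random_failure = attack(graph, 0.2, targeted=False)
hub_attack = attack(graph, 0.2, targeted=True)
print("random failures:", random_failure, "targeted attack:", hub_attack)
```

Repeating the comparison over a range of removal fractions gives a simple robustness curve for each strategy.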
Jeong and co-workers have analysed the effect of node deletions on the S. cerevisiae PPI network. They report that although proteins with five or fewer links constituted about 93% of the total number of proteins, only about 21% of them were essential. On the other hand, only 0.7% of the proteins had more than 15 links, but single deletion of 62% of these proved lethal. This implies that highly connected proteins with a central role in the architecture of the network are about three times more likely to be essential than proteins with only a small number of links to other proteins (62% against 21%).
Another comprehensive analysis of the vulnerability of complex networks to various types of attacks has also been reported. In addition to the node deletions studied earlier, the authors also studied the effects of edge removals. Further, for each case of attacks on vertices and edges, four different attacking strategies were employed: removals by the descending order of the degree and the betweenness centrality, calculated for either the initial network or the modified network during the iterative removal procedure. They report that the removals based on the re-calculated degrees and betweenness centralities are often more harmful than the attack strategies based on the initial network's parameters, underlining the importance of the changes in network structure following the removal of important edges or nodes.
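The difference between ranking nodes once on the initial network and re-calculating the ranking after every removal can be illustrated for the degree-based strategies. The sketch below is a minimal example, not the procedure of the study described above; the toy network, the helper names and the alphabetical tie-breaking rule are all assumptions made for illustration.

```python
def build_graph(edges):
    """Undirected adjacency dictionary from a list of (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def initial_degree_order(adjacency, n_remove):
    """Rank nodes once, by their degree in the intact network."""
    ranked = sorted(adjacency, key=lambda v: (-len(adjacency[v]), str(v)))
    return ranked[:n_remove]


def recalculated_degree_order(adjacency, n_remove):
    """Re-rank after every removal, using degrees in the remaining network."""
    graph = {u: set(neighbours) for u, neighbours in adjacency.items()}
    order = []
    for _ in range(min(n_remove, len(graph))):
        target = min(graph, key=lambda v: (-len(graph[v]), str(v)))
        for neighbour in graph.pop(target):
            graph[neighbour].discard(target)
        order.append(target)
    return order


# Hypothetical network: G bridges two hubs, Z is a separate small hub
toy = build_graph([
    ("H1", "G"), ("H1", "a"), ("H1", "b"), ("H1", "c"), ("H1", "d"),
    ("H2", "G"), ("H2", "e"), ("H2", "f"), ("H2", "g"), ("H2", "h"),
    ("G", "x"), ("G", "y"),
    ("Z", "p"), ("Z", "q"), ("Z", "r"),
])

print(initial_degree_order(toy, 3))       # ['H1', 'H2', 'G']
print(recalculated_degree_order(toy, 3))  # ['H1', 'H2', 'Z']
```

Once both hubs are gone, G retains only two links and is no longer the best-connected node, so the re-calculated strategy turns to Z instead; the initial ranking cannot register this change in network structure.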
Wingender and co-workers have proposed a measure, known as the pairwise disconnectivity index, which quantifies how crucial a node or an edge (or a group of nodes/edges) is for sustaining the communication between connected pairs of vertices in a directed network. This is one metric that explicitly considers paths between the various nodes in a network; it is thus quite useful in analysing how node deletions in a network can disrupt the flow of information.
We have earlier reported an analysis of the number of disrupted shortest paths in the network, to identify nodes that may be critical to a network. Network analysis has also been used for identifying pathways to drug resistance. Ge and collaborators have developed an 'information flow analysis' to identify proteins central for information transmission in interactome networks of S. cerevisiae and C. elegans; the proteins so identified were also likely to be essential for survival. The method employs confidence scores for PPIs and also considers multiple paths in a network while evaluating the importance of each protein. The analysis of node deletions from PPI networks has been used for the identification of potential drug targets [73, 76].
PPI networks provide a simplified overview of the web of interactions that take place inside a cell. The vast amounts of sequence data that have been generated have been leveraged to make better predictions of interactions and functional associations between proteins, as well as individual protein functions. By integrating experimental methods for determining PPIs and computational methods for prediction, a lot of useful data on PPIs have been generated, including a number of high-quality databases.
Although analyses of PPI networks have produced several useful results, often improving our understanding of the underlying biology, they are not without flaws. One of the key flaws of the existing methods to delineate such large-scale protein interaction networks is the limited reproducibility of such experiments; further, it is suspected that what is examined is only a small fraction of the entire proteome. However, most databases do combine multiple methods for predicting interactions, as well as results from multiple high-throughput experiments, mitigating this problem to a certain extent. Further, these networks often paint a static picture of the overwhelmingly complex dynamic interactions that take place in a cell. An improved model of these interactions must consider both the dynamics (temporal changes in the interactions) and the strengths of each of the interactions. The global overview presented by such interaction maps is no doubt useful, but the finer details of the interactions may be significantly important for our ability to make testable predictions about biological systems.
Nevertheless, protein interaction maps have many practical applications and hold the key to understanding complex biological systems. With a large amount of high-throughput data being generated at various levels, computational analyses of these data, to identify associations and interactions between various proteins, form a fundamental step in our quest to understand the organisation of complex biological systems. As Dennis Bray put it rather eloquently, "We have a new continent to explore and will need maps at every scale to find our way".
The author is grateful to Nagasuma Chandra and Andreas Wagner for their mentorship. Financial support through the YeastX project of SystemsX.ch is gratefully acknowledged.
- Eisenberg D, Marcotte EM, Xenarios I, Yeates TO: Protein function in the post-genomic era. Nature. 2000, 405 (6788): 823-826. 10.1038/35015694.View ArticlePubMedGoogle Scholar
- Anderson PW: More Is Different. Science. 1972, 177 (4047): 393-396. 10.1126/science.177.4047.393.View ArticlePubMedGoogle Scholar
- Valencia A, Pazos F: Computational methods for the prediction of protein interactions. Curr Opin Struct Biol. 2002, 12: 368-373. 10.1016/S0959-440X(02)00333-0.View ArticlePubMedGoogle Scholar
- Gavin AC, Bsche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, Remor M, Hfert C, Schelder M, Brajenovic M, Ruffner H, Merino A, Klein K, Hudak M, Dickson D, Rudi T, Gnau V, Bauch A, Bastuck S, Huhse B, Leutwein C, Heurtier MA, Copley RR, Edelmann A, Querfurth E, Rybin V, Drewes G, Raida M, Bouwmeester T, Bork P, Seraphin B, Kuster B, Neubauer G, Superti-Furga G: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002, 415 (6868): 141-147. 10.1038/415141a.View ArticlePubMedGoogle Scholar
- Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, Yang L, Wolting C, Donaldson I, Schandorff S, Shewnarane J, Vo M, Taggart J, Goudreault M, Muskat B, Alfarano C, Dewar D, Lin Z, Michalickova K, Willems AR, Sassi H, Nielsen PA, Rasmussen KJ, Andersen JR, Johansen LE, Hansen LH, Jespersen H, Podtelejnikov A, Nielsen E, Crawford J, Poulsen V, Srensen BD, Matthiesen J, Hendrickson RC, Gleeson F, Pawson T, Moran MF, Durocher D, Mann M, Hogue CWV, Figeys D, Tyers M: Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002, 415 (6868): 180-183. 10.1038/415180a.View ArticlePubMedGoogle Scholar
- Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, Hao YL, Ooi CE, Godwin B, Vitols E, Vijayadamodar G, Pochart P, Machineni H, Welsh M, Kong Y, Zerhusen B, Malcolm R, Varrone Z, Collis A, Minto M, Burgess S, McDaniel L, Stimpson E, Spriggs F, Williams J, Neurath K, Ioime N, Agee M, Voss E, Furtak K, Renzulli R, Aanensen N, Carrolla S, Bickelhaupt E, Lazovatsky Y, DaSilva A, Zhong J, Stanyon CA, Finley RL, White KP, Braverman M, Jarvie T, Gold S, Leach M, Knight J, Shimkets RA, McKenna MP, Chant J, Rothberg JM: A protein interaction map of Drosophila melanogaster. Science. 2003, 302 (5651): 1727-1736. 10.1126/science.1090289.View ArticlePubMedGoogle Scholar
- Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, Vidalain PO, Han JDJ, Chesneau A, Hao T, Goldberg DS, Li N, Martinez M, Rual JF, Lamesch P, Xu L, Tewari M, Wong SL, Zhang LV, Berriz GF, Jacotot L, Vaglio P, Reboul J, Hirozane-Kishikawa T, Li Q, Gabel HW, Elewa A, Baumgartner B, Rose DJ, Yu H, Bosak S, Sequerra R, Fraser A, Mango SE, Saxton WM, Strome S, Heuvel SVD, Piano F, Vandenhaute J, Sardet C, Gerstein M, Doucette-Stamm L, Gunsalus KC, Harper JW, Cusick ME, Roth FP, Hill DE, Vidal M: A map of the interactome network of the metazoan C. elegans. Science. 2004, 303 (5657): 540-543. 10.1126/science.1091403.PubMed CentralView ArticlePubMedGoogle Scholar
- Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, Klitgord N, Simon C, Boxem M, Milstein S, Rosenberg J, Goldberg DS, Zhang LV, Wong SL, Franklin G, Li S, Albala JS, Lim J, Fraughton C, Llamosas E, Cevik S, Bex C, Lamesch P, Sikorski RS, Vandenhaute J, Zoghbi HY, Smolyar A, Bosak S, Sequerra R, Doucette-Stamm L, Cusick ME, Hill DE, Roth FP, Vidal M: Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005, 437 (7062): 1173-1178. 10.1038/nature04209.View ArticlePubMedGoogle Scholar
- Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrn-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Canete MM, Vlasblom J, Wu S, Orsi C, Collins SR, Chandran S, Haw R, Rilstone JJ, Gandi K, Thompson NJ, Musso G, Onge PS, Ghanny S, Lam MHY, Butland G, Altaf-Ul AM, Kanaya S, Shilatifard A, O'Shea E, Weissman JS, Ingles CJ, Hughes TR, Parkinson J, Gerstein M, Wodak SJ, Emili A, Greenblatt JF: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440 (7084): 637-643. 10.1038/nature04670.View ArticlePubMedGoogle Scholar
- Arifuzzaman M, Maeda M, Itoh A, Nishikata K, Takita C, Saito R, Ara T, Nakahigashi K, Huang HC, Hirai A, Tsuzuki K, Nakamura S, Altaf-Ul-Amin M, Oshima T, Baba T, Yamamoto N, Kawamura T, Ioka-Nakamichi T, Kitagawa M, Tomita M, Kanaya S, Wada C, Mori H: Large-scale identification of protein-protein interaction of Escherichia coli K-12. Genome Res. 2006, 16 (5): 686-691. 10.1101/gr.4527806.PubMed CentralView ArticlePubMedGoogle Scholar
- Ewing RM, Chu P, Elisma F, Li H, Taylor P, Climie S, McBroom-Cerajewski L, Robinson MD, O'Connor L, Li M, Taylor R, Dharsee M, Ho Y, Heilbut A, Moore L, Zhang S, Ornatsky O, Bukhman YV, Ethier M, Sheng Y, Vasilescu J, Abu-Farha M, Lambert JP, Duewel HS, Stewart II, Kuehl B, Hogue K, Colwill K, Gladwish K, Muskat B, Kinach R, Adams SL, Moran MF, Morin GB, Topaloglou T, Figeys D: Large-scale mapping of human protein-protein interactions by mass spectrometry. Mol Syst Biol. 2007, 3: 89-10.1038/msb4100134.PubMed CentralView ArticlePubMedGoogle Scholar
- Yu H, Braun P, Yildirim MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N, Hao T, Rual JF, Dricot A, Vazquez A, Murray RR, Simon C, Tardivo L, Tam S, Svrzikapa N, Fan C, de Smet AS, Motyl A, Hudson ME, Park J, Xin X, Cusick ME, Moore T, Boone C, Snyder M, Roth FP, Barabsi AL, Tavernier J, Hill DE, Vidal M: High-quality binary protein interaction map of the yeast interactome network. Science. 2008, 322 (5898): 104-110. 10.1126/science.1158684.PubMed CentralView ArticlePubMedGoogle Scholar
- Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D: Detecting Protein Function and Protein-Protein Interactions from Genome Sequences. Science. 1999, 285 (5428): 751-753. 10.1126/science.285.5428.751.View ArticlePubMedGoogle Scholar
- Veitia RA: Rosetta Stone proteins: "chance and necessity"?. Genome Biol. 2002, 3 (2): interactions1001.1-1001.3. 10.1186/gb-2002-3-2-interactions1001.View ArticleGoogle Scholar
- Huynen M, Snel B, Lathe W, Bork P: Predicting protein function by genomic context: quantitative evaluation and qualitative inferences. Genome Res. 2000, 10 (8): 1204-1210. 10.1101/gr.10.8.1204.PubMed CentralView ArticlePubMedGoogle Scholar
- Dandekar T, Snel B, Huynen MA, Bork P: Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochemical Sci. 1998, 23 (9): 324-328. 10.1016/S0968-0004(98)01274-2.View ArticleGoogle Scholar
- Marcotte EM: Computational genetics: finding protein function by nonhomology methods. Curr Opin Struct Biol. 2000, 10: 359-365. 10.1016/S0959-440X(00)00097-X.View ArticlePubMedGoogle Scholar
- Korbel JO, Jensen LJ, von Mering C, Bork P: Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat Biotechnol. 2004, 7: 911-917. 10.1038/nbt988.View ArticleGoogle Scholar
- Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO: Assigning protein functions by comparative genome analysis: Protein phylogenetic profiles. Proc Natl Acad Sci USA. 1999, 96 (8): 4285-4288. 10.1073/pnas.96.8.4285.PubMed CentralView ArticlePubMedGoogle Scholar
- Morett E, Korbel JO, Rajan E, Saab-Rincon G, Olvera L, Olvera M, Schmidt S, Snel B, Bork P: Systematic discovery of analogous enzymes in thiamin biosynthesis. Nat Biotechnol. 2003, 21: 790-795. 10.1038/nbt834.View ArticlePubMedGoogle Scholar
- Date SV, Marcotte EM: Protein function prediction using the Protein Link Explorer (PLEX). Bioinformatics. 2005, 21 (10): 2558-2559. 10.1093/bioinformatics/bti313.View ArticlePubMedGoogle Scholar
- Thompson J: The Coevolutionary Process. 1994, Chicago: University of Chicago PressView ArticleGoogle Scholar
- Pazos F, Valencia A: Protein co-evolution, co-adaptation and interactions. EMBO J. 2008, 27 (20): 2648-2655. 10.1038/emboj.2008.189.PubMed CentralView ArticlePubMedGoogle Scholar
- Barker D, Pagel M: Predicting functional gene links from phylogenetic-statistical analyses of whole genomes. PLoS Comput Biol. 2005, 1: e3-10.1371/journal.pcbi.0010003.PubMed CentralView ArticlePubMedGoogle Scholar
- Pazos F, Helmer-Citterich M, Ausiello G, Valencia A: Correlated mutations contain information about protein-protein interaction. J Mol Biol. 1997, 271 (4): 511-523. 10.1006/jmbi.1997.1198.View ArticlePubMedGoogle Scholar
- Pazos F, Valencia A: In silico Two-Hybrid System for the Selection of Physically Interacting Protein Pairs. Proteins. 2002, 47: 219-227. 10.1002/prot.10074.View ArticlePubMedGoogle Scholar
- Goh CS, Cohen FE: Co-evolutionary analysis reveals insights into protein-protein interactions. J Mol Biol. 2002, 324: 177-192. 10.1016/S0022-2836(02)01038-0.View ArticlePubMedGoogle Scholar
- Ramani AK, Marcotte EM: Exploiting the co-evolution of interacting proteins to discover interaction specificity. J Mol Biol. 2003, 327: 273-284. 10.1016/S0022-2836(03)00114-1.View ArticlePubMedGoogle Scholar
- Pazos F, Ranea JAG, Juan D, Sternberg MJE: Assessing protein co-evolution in the context of the tree of life assists in the prediction of the interactome. J Mol Biol. 2005, 352 (4): 1002-1015. 10.1016/j.jmb.2005.07.005.View ArticlePubMedGoogle Scholar
- Juan D, Pazos F, Valencia A: High-confidence prediction of global interactomes based on genome-wide coevolutionary networks. Proc Natl Acad Sci USA. 2008, 105 (3): 934-939. 10.1073/pnas.0709671105.PubMed CentralView ArticlePubMedGoogle Scholar
- Mika S, Rost B: Protein-protein interactions more conserved within species than across species. PLoS Comput Biol. 2006, 2 (7): e79-10.1371/journal.pcbi.0020079.PubMed CentralView ArticlePubMedGoogle Scholar
- Komurov K, White M: Revealing static and dynamic modular architecture of the eukaryotic protein interaction network. Mol Syst Biol. 2007, 3: 110-10.1038/msb4100149.PubMed CentralView ArticlePubMedGoogle Scholar
- Lu X, Jain VV, Finn PW, Perkins DL: Hubs in biological interaction networks exhibit low changes in expression in experimental asthma. Mol Syst Biol. 2007, 3: 98-10.1038/msb4100138.PubMed CentralView ArticlePubMedGoogle Scholar
- Hegde SR, Manimaran P, Mande SC: Dynamic changes in protein functional linkage networks revealed by integration with gene expression data. PLoS Comput Biol. 2008, 4 (11): e1000237-10.1371/journal.pcbi.1000237.PubMed CentralView ArticlePubMedGoogle Scholar
- Donaldson I, Martin J, de Bruijn B, Wolting C, Lay V, Tuekam B, Zhang S, Baskin B, Bader GD, Michalickova K, Pawson T, Hogue CWV: PreBIND and Textomy - mining the biomedical literature for protein-protein interactions using a support vector machine. BMC Bioinformatics. 2003, 4: 11. 10.1186/1471-2105-4-11.
- von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P: STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res. 2005, 33 (Suppl 1): D433-437.
- Marcotte EM, Xenarios I, Eisenberg D: Mining literature for protein-protein interactions. Bioinformatics. 2001, 17 (4): 359-363. 10.1093/bioinformatics/17.4.359.
- Zaki N, Lazarova-Molnar S, El-Hajj W, Campbell P: Protein-protein interaction based on pairwise similarity. BMC Bioinformatics. 2009, 10: 150. 10.1186/1471-2105-10-150.
- Fields S, Song O: A novel genetic system to detect protein-protein interactions. Nature. 1989, 340 (6230): 245-246. 10.1038/340245a0.
- Gingras AC, Gstaiger M, Raught B, Aebersold R: Analysis of protein complexes using mass spectrometry. Nat Rev Mol Cell Biol. 2007, 8 (8): 645-654. 10.1038/nrm2208.
- Zhu H, Bilgin M, Bangham R, Hall D, Casamayor A, Bertone P, Lan N, Jansen R, Bidlingmaier S, Houfek T, Mitchell T, Miller P, Dean RA, Gerstein M, Snyder M: Global analysis of protein activities using proteome chips. Science. 2001, 293 (5537): 2101-2105. 10.1126/science.1062191.
- Michaud GA, Salcius M, Zhou F, Bangham R, Bonin J, Guo H, Snyder M, Predki PF, Schweitzer BI: Analyzing antibody specificity with whole proteome microarrays. Nat Biotechnol. 2003, 21 (12): 1509-1512. 10.1038/nbt910.
- Mattoon DR, Schweitzer B: Profiling protein interaction networks with functional protein microarrays. Methods Mol Biol. 2009, 563: 63-74.
- Shoemaker BA, Panchenko AR: Deciphering protein-protein interactions. Part I. Experimental techniques and databases. PLoS Comput Biol. 2007, 3 (3): e42. 10.1371/journal.pcbi.0030042.
- Uetz P: Experimental methods for protein interaction identification and characterization. Protein-protein interactions and networks. Springer, 2008: 1-32.
- Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon D, Narayan V, Srinivasan M, Pochart P, Qureshi-Emili A, Li Y, Godwin B, Conover D, Kalbfleisch T, Vijayadamodar G, Yang M, Johnston M, Fields S, Rothberg JM: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature. 2000, 403 (6770): 623-627. 10.1038/35001009.
- Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y: A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA. 2001, 98 (8): 4569-4574. 10.1073/pnas.061034498.
- Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, Muller J, Doerks T, Julien P, Roth A, Simonovic M, Bork P, von Mering C: STRING 8-a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 2009, 37 (Database): D412-D416. 10.1093/nar/gkn760.
- Hooper SD, Bork P: Medusa: a simple tool for interaction graph analysis. Bioinformatics. 2005, 21 (24): 4432-4433. 10.1093/bioinformatics/bti696.
- Peri S, Navarro JD, Amanchy R, Kristiansen TZ, Jonnalagadda CK, Surendranath V, Niranjan V, Muthusamy B, Gandhi TKB, Gronborg M, Ibarrola N, Deshpande N, Shanker K, Shivashankar HN, Rashmi BP, Ramya MA, Zhao Z, Chandrika KN, Padma N, Harsha HC, Yatish AJ, Kavitha MP, Menezes M, Choudhury DR, Suresh S, Ghosh N, Saravana R, Chandran S, Krishna S, Joy M, Anand SK, Madavan V, Joseph A, Wong GW, Schiemann WP, Constantinescu SN, Huang L, Khosravi-Far R, Steen H, Tewari M, Ghaffari S, Blobe GC, Dang CV, Garcia JGN, Pevsner J, Jensen ON, Roepstorff P, Deshpande KS, Chinnaiyan AM, Hamosh A, Chakravarti A, Pandey A: Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res. 2003, 13 (10): 2363-2371. 10.1101/gr.1680803.
- Xenarios I, Fernandez E, Salwinski L, Duan XJ, Thompson MJ, Marcotte EM, Eisenberg D: DIP: The Database of Interacting Proteins: 2001 update. Nucleic Acids Res. 2001, 29: 239-241. 10.1093/nar/29.1.239.
- Mellor JC, Yanai I, Clodfelter KH, Mintseris J, DeLisi C: Predictome: a database of putative functional links between proteins. Nucleic Acids Res. 2002, 30: 306-309. 10.1093/nar/30.1.306.
- Hu Z, Snitkin ES, DeLisi C: VisANT: an integrative framework for networks in systems biology. Brief Bioinform. 2008, 9 (4): 317-325. 10.1093/bib/bbn020.
- Pavlopoulos G, Wegener AL, Schneider R: A survey of visualization tools for biological network analysis. BioData Min. 2008, 1: 12. 10.1186/1756-0381-1-12.
- Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13 (11): 2498-2504. 10.1101/gr.1239303.
- Assenov Y, Ramírez F, Schelhorn SE, Lengauer T, Albrecht M: Computing topological parameters of biological networks. Bioinformatics. 2008, 24 (2): 282-284. 10.1093/bioinformatics/btm554.
- Fruchterman TMJ, Reingold EM: Graph drawing by force-directed placement. Softw Pract Exper. 1991, 21 (11): 1129-1164. 10.1002/spe.4380211102.
- Watts DJ, Strogatz SH: Collective dynamics of 'small-world' networks. Nature. 1998, 393 (6684): 440-442. 10.1038/30918.
- Barabási AL, Albert R: Emergence of Scaling in Random Networks. Science. 1999, 286 (5439): 509-512. 10.1126/science.286.5439.509.
- Albert R, Jeong H, Barabási AL: Diameter of the World-Wide Web. Nature. 1999, 401: 130-131. 10.1038/43601.
- Barabási AL, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet. 2004, 5 (2): 101-113. 10.1038/nrg1272.
- Mason O, Verwoerd M: Graph theory and networks in Biology. IET Syst Biol. 2007, 1 (2): 89-119. 10.1049/iet-syb:20060038.
- Diestel R: Graph Theory. Graduate Texts in Mathematics, 173. 2000, Springer-Verlag.
- Freeman LC: A set of measures of centrality based on betweenness. Sociometry. 1977, 40: 35-41. 10.2307/3033543.
- Girvan M, Newman MEJ: Community structure in social and biological networks. Proc Natl Acad Sci USA. 2002, 99 (12): 7821-7826. 10.1073/pnas.122653799.
- Watts D: Six Degrees. 2003, London: W. W. Norton & Company.
- Newman MEJ: The Structure and Function of Complex Networks. SIAM Review. 2003, 45 (2): 167-256. 10.1137/S003614450342480.
- Chang AN: Prioritizing genes for pathway impact using network analysis. Methods Mol Biol. 2009, 563: 141-156.
- Albert R, Jeong H, Barabási AL: Error and attack tolerance of complex networks. Nature. 2000, 406 (6794): 378-382. 10.1038/35019019.
- Jeong H, Mason SP, Barabási AL, Oltvai ZN: Lethality and centrality in protein networks. Nature. 2001, 411 (6833): 41-42. 10.1038/35075138.
- Holme P, Kim BJ, Yoon CN, Han SK: Attack vulnerability of complex networks. Phys Rev E. 2002, 65 (5): 056109. 10.1103/PhysRevE.65.056109.
- Potapov AP, Goemann B, Wingender E: The pairwise disconnectivity index as a new metric for the topological analysis of regulatory networks. BMC Bioinformatics. 2008, 9: 227. 10.1186/1471-2105-9-227.
- Raman K, Kalidas Y, Chandra N: targetTB: A target identification pipeline for Mycobacterium tuberculosis through an interactome, reactome and genome-scale structural analysis. BMC Syst Biol. 2008, 2: 109.
- Raman K, Chandra N: Mycobacterium tuberculosis interactome analysis unravels potential pathways to drug resistance. BMC Microbiol. 2008, 8: 234. 10.1186/1471-2180-8-234.
- Missiuro PV, Liu K, Zou L, Ross BC, Zhao G, Liu JS, Ge H: Information flow analysis of interactome networks. PLoS Comput Biol. 2009, 5 (4): e1000350. 10.1371/journal.pcbi.1000350.
- Raman K, Vashisht R, Chandra N: Strategies for efficient disruption of metabolism in Mycobacterium tuberculosis from network analysis. Mol Biosyst. 2009, 5: 1740-1751. 10.1039/b905817f.
- von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P: Comparative assessment of large-scale data sets of protein-protein interactions. Nature. 2002, 417 (6887): 399-403. 10.1038/nature750.
- Bray D: Molecular networks: the top-down view. Science. 2003, 301 (5641): 1864-1865. 10.1126/science.1089118.
- Bader GD, Betel D, Hogue CWV: BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res. 2003, 31: 248-250. 10.1093/nar/gkg056.
- Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M: BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 2006, 34 (Database): D535-D539. 10.1093/nar/gkj109.
- Tatusov RL, Koonin EV, Lipman DJ: A genomic perspective on protein families. Science. 1997, 278 (5338): 631-637. 10.1126/science.278.5338.631.
- Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA: The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003, 4: 41. 10.1186/1471-2105-4-41.
- Prasad TSK, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A, Balakrishnan L, Marimuthu A, Banerjee S, Somanathan DS, Sebastian A, Rani S, Ray S, Kishore CJH, Kanth S, Ahmed M, Kashyap MK, Mohmood R, Ramachandra YL, Krishna V, Rahiman BA, Mohan S, Ranganathan P, Ramabadran S, Chaerkady R, Pandey A: Human Protein Reference Database-2009 update. Nucleic Acids Res. 2009, 37 (Database): D767-D772. 10.1093/nar/gkn892.
- Aranda B, Achuthan P, Alam-Faruque Y, Armean I, Bridge A, Derow C, Feuermann M, Ghanbarian AT, Kerrien S, Khadake J, Kerssemakers J, Leroy C, Menden M, Michaut M, Montecchi-Palazzi L, Neuhauser SN, Orchard S, Perreau V, Roechert B, van Eijk K, Hermjakob H: The IntAct molecular interaction database in 2010. Nucleic Acids Res. 2009. 10.1093/nar/gkp878.
- Finn RD, Marshall M, Bateman A: iPfam: visualization of protein-protein interactions in PDB at domain and amino acid resolutions. Bioinformatics. 2005, 21 (3): 410-412. 10.1093/bioinformatics/bti011.
- Chatr-aryamontri A, Ceol A, Palazzi LM, Nardelli G, Schneider MV, Castagnoli L, Cesareni G: MINT: the Molecular INTeraction database. Nucleic Acids Res. 2007, 35 (Database): D572-D574. 10.1093/nar/gkl950.
- Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D: Prolinks: a database of protein functional linkages derived from coevolution. Genome Biol. 2004, 5: R35. 10.1186/gb-2004-5-5-r35.
- Winter C, Henschel A, Kim WK, Schroeder M: SCOPPI: a structural classification of protein-protein interfaces. Nucleic Acids Res. 2006, 34 (Database): D310-D314. 10.1093/nar/gkj099.
- Freeman TC, Goldovsky L, Brosch M, van Dongen S, Mazière P, Grocock RJ, Freilich S, Thornton J, Enright AJ: Construction, visualisation, and clustering of transcription networks from microarray expression data. PLoS Comput Biol. 2007, 3 (10): 2032-2042. 10.1371/journal.pcbi.0030206.
- Adai AT, Date SV, Wieland S, Marcotte EM: LGL: creating a map of protein function with an algorithm for visualizing very large biological networks. J Mol Biol. 2004, 340: 179-190. 10.1016/j.jmb.2004.04.047.
- Breitkreutz BJ, Stark C, Tyers M: Osprey: a network visualization system. Genome Biol. 2003, 4 (3): R22. 10.1186/gb-2003-4-3-r22.
- Batagelj V, Mrvar A: Pajek - Program for Large Network Analysis. Connections. 1998, 21: 47-57. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.