The past decade has witnessed exponential growth in the generation of high-throughput human data across almost all known dimensions of biological systems. The discipline of network medicine has rapidly evolved in parallel, providing an unbiased, comprehensive biological framework through which to interrogate and integrate systematically these large-scale, multi-omic data to enhance our understanding of disease mechanisms and to design drugs that reflect a deep knowledge of molecular pathobiology. In this review, we discuss the key principles of network medicine and the human disease network and explore the latest applications of network medicine in this multi-omic era. We also highlight the current conceptual and technological challenges, which serve as exciting opportunities by which to improve and expand the network-based applications beyond the artificial boundaries of the current state of human pathobiology.
The traditional reductionistic approach that links a single (or limited number of) molecular mediator(s) to a disease phenotype (pathophenotype) provides limited insight into our understanding of complex human diseases. This conventional approach, however, remains deeply rooted in the contemporary scientific method and currently serves as a barrier to a comprehensive understanding of complex disease phenotypes and optimal drug development. Combining systems biology and network science, network medicine approaches complex pathobiology as the sequela of perturbations that involve closely linked multiple molecular network components rather than a direct consequence of a single gene or molecular defect.
For the past decade, there has been rapid growth in our ability to acquire and process high-throughput data in the multiple domains of modern molecular biology, including genomic, transcriptomic, epigenetic, proteomic, and metabolomic analyses. Network medicine serves as a powerful tool by which to integrate these multidimensional data and to improve our understanding of the connections among the multiple layers of genomic features and deep clinical phenotypes.
In this review, we first introduce the basic principles of network medicine and the structure of human disease networks. We then detail the specific ways by which network science is being applied to various large-scale, high-throughput data modalities with examples. We also review how network medicine can be applied to reappraise disease classification and facilitate rational drug design. In addition, we highlight recent studies that exemplify novel applications of network medicine and conclude by discussing current challenges and future directions of this rapidly expanding discipline.
Basic Principles and Key Components of Network Medicine
Networks can be used to represent a broad range of biological systems. A biological network consists of multiple nodes that represent distinct individual biological entities, such as genes, proteins, or metabolites. Relationships among nodes are depicted by connecting lines, denoted edges, which may represent a wide variety of molecular interactions, such as gene regulation, physical protein-protein interaction, or substrate metabolism (Figure 1). Nodes that have more frequent interactions with other nodes are called hubs and tend to play key roles in biological processes. Importantly, genes for chronic, complex diseases are typically located in the periphery of the network, with relatively fewer connections compared with hubs.
Degree refers to the number of edges to which a node is connected. Identification of the node with the highest degree can help identify a biological entity with the most central role within a network (degree centrality) (Figure 1). The strengths of interactions can be weighted (with edge thickness related to the weight of the association) (Figure 1A) or unweighted (Figure 1B). Edges can be directed (Figure 1C) or undirected (Figure 1B). These relationships are visualized and analyzed using graph theory. A connection between two nodes created by following the edges that link them is called a path. The shortest path length is the minimal number of edges required to connect two nodes (Figure 1).
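The definitions of degree, hub, and shortest path length above can be made concrete with a small sketch. The toy network and the helper names (`build_graph`, `degree`, `shortest_path_length`) are illustrative assumptions, not part of the original review; the network is treated as undirected and unweighted.

```python
from collections import deque

def build_graph(edges):
    """Return an adjacency map {node: set(neighbors)} for an undirected, unweighted network."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree(adj, node):
    """Degree = number of edges attached to a node."""
    return len(adj[node])

def shortest_path_length(adj, source, target):
    """Minimal number of edges between two nodes (breadth-first search); None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nb in adj[node]:
            if nb == target:
                return dist + 1
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return None

# Hypothetical toy network: A is the most connected node.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E"), ("E", "F")]
adj = build_graph(toy_edges)
hub = max(adj, key=lambda n: degree(adj, n))  # node with the highest degree centrality
```

In this toy example the node with the highest degree plays the role of a hub, and the path from B to F passes through it.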
Figure 1. Basic network properties and network types. Nodes represent distinct biological entities and are connected by edges. Hubs are highly connected nodes and often represent essential biological elements. Shortest path lengths represent the minimal number of edges connecting two nodes. A and B: Edges can be unweighted or weighted (with varying thicknesses) to signify the different strengths of given biological interactions. C: Edges can also be used to depict the directionality of chosen molecular interactions. D: The relationships between more than one type of node (depicted as circles and squares) can be represented in a bipartite network. E and F: Scale-free properties of biological networks. In random networks, the degree distribution [P(k)] follows a binomial distribution. Most biological networks are scale-free networks with their P(k) following a power law distribution. Only a small number of nodes (hubs) are highly connected. Adapted from Loscalzo et al.
The degree distribution of most biological networks typically follows a power law, where most nodes have low degrees (sparsely connected to other nodes) and a small number of nodes (hubs) have high degree (highly connected to other nodes) (Figure 1, E and F). These networks are scale free in that the slope reflecting the relationship between the logarithmic values of a degree (k) and the probability of k is constant regardless of size of the network.
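The two degree distributions contrasted here can be written out explicitly. The symbols γ (degree exponent), c (a constant), N (number of nodes), and p (connection probability) are notation assumed for this illustration and are not named in the text.

```latex
% Scale-free network: power-law degree distribution
P(k) \propto k^{-\gamma}
\quad\Longrightarrow\quad
\log P(k) = -\gamma \log k + c

% Random network with N nodes and connection probability p: binomial degree distribution
P(k) = \binom{N-1}{k}\, p^{k} (1-p)^{N-1-k}
```

On a log-log plot the power law is a straight line with slope −γ, which is the constant slope described above.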
The scale-free properties of biological networks facilitate the flow of information across the network with minimal transition time. The overdetermined nature of scale-free networks allows insight into network function even when certain network components are missing (or knowledge of the network is incomplete). These networks are also characterized by emergent behavior, which highlights the important notion that a complete understanding of an individual network component in isolation does not predict its behavior within the network setting or the behavior of the overall network.
Clustering within networks reflects the tendency of the neighbors of a node to be connected to each other as well. The local clustering coefficient quantifies how densely the neighbors of a given node are connected to one another. Certain recurring patterns of nodal interactions that represent specific biological functions are called motifs, which have proven to be useful for predicting network function, especially in regulatory networks. Topologic subunits within networks that contain densely connected nodes are called network modules or communities.
These topologic communities or modules tend to carry distinct biological properties as exemplified by the identification of multiple modules enriched with specific biological pathways (eg, coagulation, chemotaxis, opsonization) in network analysis of the inflammatory responses after myocardial infarction (MI).
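The local clustering coefficient mentioned above can be sketched as follows, using the standard definition (the fraction of possible links among a node's neighbors that actually exist). The adjacency map and function name are hypothetical illustrations.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected.

    C = 2 * e / (k * (k - 1)), where k is the node's degree and e is the
    number of edges among its neighbors. Nodes with k < 2 get 0.0.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

# Hypothetical toy network: a triangle A-B-C with a pendant node D attached to A.
toy_adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
coefficients = {n: local_clustering(toy_adj, n) for n in toy_adj}
```

Nodes inside a tightly knit module score close to 1, whereas a hub that bridges otherwise unconnected neighbors scores close to 0.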
Most established networks represent static relationships among biological entities (eg, topological relationships). There is, however, increasing interest in creating dynamic biological network models that capture nonstatic network properties, such as flux analyses. Bipartite networks consist of more than one type of node, representing multiple biological entities (Figure 1D). Gene regulatory networks with their nodes representing regulators or target genes are typical examples. Thus, biological systems are best understood as a collection of multiple dynamic and static networks whose interactions determine the outcomes of biological (and pathobiological) processes. Because biological networks follow nongaussian distributions, standard statistical tests that rely on normality cannot be applied. Instead, randomized network topologies are typically used to compare the statistical significance of scale-free network-based findings with those of random expectations for a similarly sized system.
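The comparison against randomized topologies described above can be sketched with a degree-preserving null model. This is a minimal illustration under stated assumptions: double-edge swaps are used for randomization, the statistic is the number of edges inside a candidate module, and the toy network, function names, and parameter values are all hypothetical rather than taken from the review.

```python
import random

def degree_sequence(edges):
    """Map each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def count_internal_edges(edges, module):
    """Number of edges whose two endpoints both lie in the module."""
    return sum(1 for u, v in edges if u in module and v in module)

def degree_preserving_shuffle(edges, n_swaps, rng):
    """Attempt n_swaps double-edge swaps (a-b, c-d) -> (a-d, c-b).

    Swaps that would create self-loops or duplicate edges are skipped,
    so every node keeps its original degree.
    """
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # randomize the orientation of the second edge
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list

def empirical_p_value(edges, module, n_random=200, seed=0):
    """Share of randomized networks with at least as many internal edges as observed."""
    rng = random.Random(seed)
    observed = count_internal_edges(edges, module)
    hits = 0
    for _ in range(n_random):
        shuffled = degree_preserving_shuffle(edges, 10 * len(edges), rng)
        if count_internal_edges(shuffled, module) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_random + 1)

# Hypothetical toy network: a triangle (A, B, C) attached to a five-node cycle.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E"),
             ("E", "F"), ("F", "G"), ("G", "H"), ("H", "D")]
observed, p_value = empirical_p_value(toy_edges, {"A", "B", "C"})
```

Preserving the degree of every node keeps hubs as hubs in the null model, so a significant result cannot be explained by the degree distribution alone.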
Building the Human Interactomes and Disease Networks
Protein-Protein Interaction Networks
Protein-protein interaction (PPI) data have been widely used to examine the molecular relationships among intracellular network components and to construct an interactome. In a PPI network, nodes depict proteins and edges represent physical interactions among the proteins. These interactions have been extensively curated by literature and database searches or predicted by computer-based algorithms with experimental validation.
The current human PPI catalog is estimated to cover approximately 25% of all possible interactions, whereas the rest remain as yet undetected or unexamined.
There has been a large-scale proteomic-level effort to test all possible PPIs using techniques such as yeast two-hybrid screens or affinity purification–mass spectrometry. Methods that detect binary interactions, such as yeast two-hybrid screens, are better suited to detect transient interactions, whereas those that assess co-complex associations, such as affinity purification–mass spectrometry, are biased toward detecting more stable protein complexes and more abundant proteins. Inspection bias is inherent in the networks created using current curation approaches because frequently studied proteins are better represented in the literature and in databases. Additional limitations of the current PPI-based network analysis are discussed in the following sections. Ongoing efforts include improving PPI prediction algorithms,
Figure 2. The human interactome and disease network modules. A: The human interactome represents unbiased mapping of all known biological interactions. The disease network modules can be constructed by placing a set of genes or gene products that are differentially expressed in specific disease states in the context of the interactome. B: This subnetwork of the human interactome contains the disease modules specific to multiple sclerosis (MS), peroxisomal disorders (PD), and rheumatoid arthritis (RA). The molecular relationships among different disease entities can be examined in this context. C: The degree of topological overlap between the MS and RA disease modules is represented by a Venn diagram. The network-based separation of a disease pair, A and B, is defined as S_AB = ⟨d_AB⟩ − (⟨d_AA⟩ + ⟨d_BB⟩)/2. This compares the mean shortest distances of the protein pairings between the diseases A and B (d_AB) and the shortest distances of the protein pairings within the disease A (d_AA) and B (d_BB). Negative S_AB indicates overlap between the diseases A and B; S_AB for MS and RA is −0.2. The graph below depicts the probability distribution [P(d)] as a function of the shortest distances. D: The MS and PD disease modules show no topologic overlap with positive network-based separation value (S_AB = 1.3). From Menche J, Sharma A, Kitsak M, Ghiassian SD, Vidal M, Loscalzo J, Barabasi AL: Disease networks: uncovering disease-disease relationships through the incomplete interactome. Science 2015, 347:1257601.
Adapted with permission from AAAS. From Leopold JA, Loscalzo J: Emerging role of precision medicine in cardiovascular disease. Circ Res 2018, 122:1302–1315.
A disease network module is defined as a subgroup of interactive nodes whose altered states (eg, gene deletions, mutations, copy number variations, or differential expression) are associated with specific disease phenotypes. Disease modules can be constructed by mapping a set of genes or gene products (proteins) that are altered (mutated) or differentially expressed in individuals with specific disease phenotypes into an established human interactome.
The disease module hypothesis posits that genes and gene products associated with a given disease are more likely to interact and segregate with one another in a local subnetwork than be distributed randomly throughout the human interactome.
The study of proteins that interact with known disease-associated gene products in the human interactome and their subnetworks has enhanced our understanding of various disease mechanisms. Examples include but are not limited to Alzheimer disease,
In addition to this linkage-based method, novel disease genes have been identified by assessing their presence within an established disease network (disease module method)
Because the human interactome continues to expand, disease network–based studies will continue to improve our understanding of complex disease mechanisms.
Human Disease Networks and Disease-Disease Relationship
The study of disease networks has also allowed an unbiased examination of pathobiological relationships among different disease processes. Using a computational approach that examined 299 diseases each with at least 20 known disease-associated genes in the context of the incomplete human interactome, Menche et al established a method to examine molecular relationships among different diseases even in the absence of shared disease genes. They demonstrated that disease modules with greater topologic overlap within the human interactome tend to share similar biological processes or clinical phenotypes more closely than the modules that are mapped far apart from one another (Figure 2). For example, rheumatoid arthritis and multiple sclerosis were noted to have highly overlapping disease modules (Figure 2, B and C), whereas gene products associated with multiple sclerosis and peroxisomal disorders were separated by a greater (noneuclidean) distance in the human interactome (Figure 2, B and D). Identification of previously unrecognized, shared mechanisms between separate disease entities has become possible with this disease network–based approach.
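The network-based separation given in the Figure 2 legend, S_AB = ⟨d_AB⟩ − (⟨d_AA⟩ + ⟨d_BB⟩)/2, can be sketched as follows. This illustration assumes one common reading of the definition: each within-disease term averages, over the proteins of a disease, the distance to the closest other protein of the same disease, and the between-disease term averages the distance to the closest protein of the other disease in both directions. The toy network and function names are hypothetical.

```python
from collections import deque

def build_graph(edges):
    """Adjacency map for an undirected, unweighted network."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def closest_distances(adj, sources, targets, exclude_self):
    """For each source, distance to the closest reachable target."""
    out = []
    for s in sources:
        dist = bfs_distances(adj, s)
        candidates = [dist[t] for t in targets
                      if t in dist and not (exclude_self and t == s)]
        if candidates:
            out.append(min(candidates))
    return out

def mean(values):
    return sum(values) / len(values)

def separation(adj, genes_a, genes_b):
    """S_AB = <d_AB> - (<d_AA> + <d_BB>) / 2; assumes each set has >= 2 connected members."""
    d_aa = mean(closest_distances(adj, genes_a, genes_a, exclude_self=True))
    d_bb = mean(closest_distances(adj, genes_b, genes_b, exclude_self=True))
    d_ab = mean(closest_distances(adj, genes_a, genes_b, exclude_self=False)
                + closest_distances(adj, genes_b, genes_a, exclude_self=False))
    return d_ab - (d_aa + d_bb) / 2

# Hypothetical toy interactome: a simple path 1-2-3-4-5-6.
adj = build_graph([(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)])
s_far = separation(adj, {1, 2}, {5, 6})      # modules at opposite ends of the path
s_overlap = separation(adj, {2, 3}, {3, 4})  # modules sharing node 3
```

As in the figure legend, overlapping modules give a negative value and well-separated modules give a positive one.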
Of note, only 10% of human genes have known disease associations (Online Mendelian Inheritance in Man, www.omim.org, last accessed January 31, 2019). Owing to the incomplete nature of the current human interactome and the disease gene list, a large number of disease genes may appear to be scattered across the human interactome without forming detectable subnetworks. This limitation can be overcome and a more cohesive disease subnetwork structure can be inferred by the iterative addition of genes or proteins known to have significant interactions with known disease genes (also called seed genes) as they become increasingly available from, for example, unbiased genomic screens. A number of computational methods have been developed to approach these relationships analytically, including the Disease Module Detection method
The simple presence or absence of disease genes cannot adequately explain complex clinical phenotypes. Edge-centric models help shift our attention to perturbations specific to the interactions between network components (edges) to better understand complex genotype-phenotype relationships.
Human senataxin protein biology highlights the importance of edgetic perturbations in pathobiology. Different mutations in the same amino-terminal domain of senataxin result in clinically distinct phenotypes (autosomal dominant amyotrophic lateral sclerosis type 4 or autosomal recessive ataxia with oculomotor apraxia type 2) via their differential effects on the interactions with protein-binding partners.
There is ongoing effort to profile systematically a large number of gene mutations, their effects on molecular interactions (edgotyping), and their phenotypic correlations.
A PPI analysis of 23 inherited cerebellar ataxias using 23 seemingly unrelated inherited mutations in distinct genes found that many of them not only interacted with one another but also shared a number of binding partners that are known genetic modifiers in Purkinje cell (patho)biological (apoptotic) pathways.
In recent years, there has been increasing recognition of post-translational modifications (PTMs) and their contributions to human biology and pathobiology. Almost all proteins undergo PTMs, and PTMs have a ubiquitous and dynamic impact on PPIs.
In a temporal- and spatial-specific manner, they may stabilize, destabilize, or delete an individual node or multiple (protein) nodes within a biological network or strengthen, weaken, delete, or otherwise modify existing PPIs or create new ones. PTMs, therefore, should be considered as integral elements of biological networks. The human interactome to date, however, reflects a single form of PTM (ie, phosphorylation, the best studied PTM) of >200 known forms, representing a clear challenge for the discipline. The challenges that involve integrating PTMs into PPI networks include the complexity of highly dynamic PTM biology,
recently developed an integrative bioinformatics tool for large-scale PTM discovery (iPTMnet), which combines PTM-relevant literature text mining, experimentally validated PTM databases, and protein ontology functions. This database currently covers eight PTM forms (phosphorylation, acetylation, ubiquitination, SUMOylation, glycosylation, methylation, S-nitrosylation, and myristoylation) with 654,500 PTM sites identified in >62,100 proteins, including >1200 PTM enzymes and 24,300 PTM enzyme-substrate associations. Machine learning approaches have been increasingly applied to PTM prediction and identification efforts.
Numerous PTMs that involve histone tails and their interactions exemplify the need to approach PTM biology in an integrative manner. Histone tail PTMs involve phosphorylation; acetylation; mono-, di-, or tri-methylation; ubiquitination; biotinylation; SUMOylation; Arg methylation; neddylation; and Glu ADP ribosylation.
Studies have demonstrated numerous interactions between these processes, including multidirectional modulation between phosphorylation and ubiquitination as the most extensively studied example.
Interplays between PTMs and other related biological regulatory mechanisms, such as PTM-epigenetic regulatory network interaction, should also be considered as we continue to build more comprehensive networks.
This section explores the major types of biological networks that accommodate systems-level -omics studies and discusses how they are increasingly merging to enable the integrative, multidimensional assessment of complex human diseases.
Network Analysis of Genetic Studies
Genome-wide association studies (GWASs) have typically focused on the discovery of a single marker associated with a given disease. Complex human disease phenotypes can rarely be explained by a single gene (cause) because they involve contributions from multiple genes, gene products, and their interactions with one another and with the environment. Variants in genes associated with complex human diseases are typically detected with low statistical power in GWASs. Owing to stringent statistical criteria, a number of important genetic variants may not reach genome-wide significance and may, therefore, remain undetected.
In this post-GWAS era, genomic analysis integrated with the PPI network has enabled the assessments of broader groups of genes and variants and has helped identify novel disease pathways in multiple human diseases.
A number of computational tools have been developed to assist in the integration of GWAS and PPI data and prioritize findings for functional validation.
Limitations of genomic networks include a potential bias toward longer genes and genes that contain more characterized single-nucleotide polymorphisms (SNPs).
There currently is a limited understanding of the variable heritability (variable phenotypes, penetrance) in complex human diseases despite ongoing advances in whole-genome sequencing and more granular clinical phenotyping. Although this limitation may, in part, be attributable to unidentified genetic modifiers and phenotype heterogeneity, currently limited considerations for gene-gene interactions (epistasis) and genotypic context likely play a large role.
constructed statistical epistasis networks by examining pairwise interactions among 1422 SNPs from nearly 500 bladder cancer susceptibility genes. This disease-specific global epistasis network is scale free and has a topology distinct from that of a network generated under the assumption of no epistatic association. This epistasis network had significantly more hub SNPs, with the largest connected component including 39 SNPs. Limitations of this approach include the possibility of failing to account for important higher-order SNP interactions that occur without pairwise interactions. Systematic assessment of epistasis at the genome-wide level remains challenging owing to a fundamental gap between currently available statistical epistasis models and true biological phenomena.
Inferring gene regulatory mechanisms from large-scale gene expression data remains a poorly defined and important challenge. One approach is based on gene expression correlations; its limitations include the absence of directional information and the difficulty of distinguishing direct interactions from indirect interactions and from coexpression with no regulatory relationship. Mutual information-based algorithms have been used to overcome such limitations, with ARACNe
as early examples. An alternative approach is to infer regulatory relationships based on the similarities of gene expression patterns compared with known transcription factors.
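The mutual information-based idea mentioned above can be illustrated with a simplified sketch: score every gene pair by mutual information and then, within each fully connected triplet, drop the weakest edge as a likely indirect interaction (a data processing inequality step). This is not the published ARACNe implementation; the equal-width binning, the function names, and the toy expression data are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

def discretize(values, n_bins=2):
    """Equal-width binning of one expression profile."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against constant profiles
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x, y, n_bins=2):
    """Plug-in estimate of mutual information (in nats) between two profiles."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum((c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def mi_network(expr, threshold=0.0, n_bins=2):
    """Keep gene pairs with MI above threshold, then prune the weakest edge of each triangle."""
    genes = sorted(expr)
    mi = {frozenset(pair): mutual_information(expr[pair[0]], expr[pair[1]], n_bins)
          for pair in combinations(genes, 2)}
    edges = {e for e, value in mi.items() if value > threshold}
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        triangle = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in edges for e in triangle):
            to_remove.add(min(triangle, key=lambda e: mi[e]))
    return {e: mi[e] for e in edges - to_remove}

# Hypothetical toy data: G1 tracks TF with some noise, and G2 tracks G1 with further noise,
# so the TF-G2 association is the weakest of the three.
toy_expr = {
    "TF": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    "G1": [1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
    "G2": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
}
network = mi_network(toy_expr)
```

In the toy data the TF-G2 edge is removed because it is the weakest edge of the only triangle, leaving the chain TF-G1-G2.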
approached the gene regulatory network inference problems by decomposing them as separate regression problems with respect to each of the target genes in the system (GENIE3). For each regression problem, the expression pattern of a target gene was predicted based on those of all other genes (input genes), and the putative regulatory relationship could be inferred by assessing the importance of an input gene's expression in predicting the target gene's expression. All regulatory relationships were then aggregated across all genes to infer a full regulatory network. Additional approaches, including probabilistic graphical model–based methods, have proven to be of limited use in the setting of large-scale data analyses and were recently reviewed.
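The decomposition described above (one regression problem per target gene, with input-gene importance used as the putative regulatory score) can be sketched in a few lines. As a loudly stated assumption, this sketch swaps in ridge-regularized linear regression and uses the absolute standardized coefficient as the importance measure; it is not the GENIE3 implementation, and all function names, the ridge value, and the toy data are hypothetical.

```python
def standardize(col):
    """Center a profile and scale it to unit variance."""
    n = len(col)
    m = sum(col) / n
    sd = (sum((v - m) ** 2 for v in col) / n) ** 0.5 or 1.0
    return [(v - m) / sd for v in col]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def infer_network(expr, ridge=1e-3):
    """Return {(input_gene, target_gene): importance} aggregated over all targets."""
    genes = sorted(expr)
    z = {g: standardize(expr[g]) for g in genes}
    scores = {}
    for target in genes:  # one regression problem per target gene
        inputs = [g for g in genes if g != target]
        xtx = [[sum(p * q for p, q in zip(z[gi], z[gj])) + (ridge if gi == gj else 0.0)
                for gj in inputs] for gi in inputs]
        xty = [sum(p * q for p, q in zip(z[gi], z[target])) for gi in inputs]
        for g, w in zip(inputs, solve(xtx, xty)):
            scores[(g, target)] = abs(w)  # importance of input g for predicting target
    return scores

# Hypothetical toy data: G3 is driven mostly by G1 and only weakly by G2.
g1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
g2 = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0]
g3 = [2 * a + 0.1 * b for a, b in zip(g1, g2)]
scores = infer_network({"G1": g1, "G2": g2, "G3": g3})
```

Because the toy genes are exactly linearly related, only scores for the same target gene should be compared here; with real data, nonlinear learners such as tree ensembles are a natural replacement for the linear step.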
Single-Cell RNA Sequencing Technologies and Gene Regulatory Networks
With the widespread adoption of single-cell transcriptomic analysis techniques across multiple disciplines, there is an ongoing effort to build an optimal network model to infer more complex gene regulatory relationships from markedly larger data sets.
In contrast to the technical limitations of bulk RNA sequencing for which averaging heterogeneous expression levels in different cells may cancel out true biological signals, the ability to quantify the exact gene expression levels in individual cells under a given set of conditions offers an unprecedented opportunity to accelerate our understanding of gene regulation across many cell types. The challenges that involve single cell–based network analysis include the relatively small number of genes being sequenced at a time and genes with low-level expressions being undetected (dropout).
These limitations can lead to missing gene-gene interactions and misrepresentation of zero-inflated genes as having higher correlations.
or by using the data diffusion method to infer missing information across similar cells and reconstructing missing gene-gene relationships to infer their regulatory information (Markov affinity-based graph imputation of cells
The cell heterogeneity within and across different cell types is another source of challenge to gene regulatory network inference from single-cell transcriptome data. A regulatory network constructed from the gene expression data of one cell population would likely differ from another network derived from a distinct cell population where cell type identification (clustering) is based on the similarities in their gene expression patterns. One approach that can be used to overcome this challenge would be to combine the cell type identification step and gene regulatory inference during the initial analysis.
Even within a single cluster of cells putatively representing a unique cell (sub)type, cell-to-cell biological variability is innately present owing to many factors, including transcriptional bursting and different stages of common biological processes, such as cell cycle or apoptosis. Together with multiple sources of technical variations, cell heterogeneity makes robust statistical inference difficult. Additional challenges include scaling up the network models to accommodate markedly larger numbers of samples (thousands of individual cells). Aibar et al proposed a new method, Single-Cell Regulatory Network Inference and Clustering, which first creates co-expression modules with known transcription factors and then enriches for true direct interactions using cis-regulatory motif analyses with RcisTarget to select the modules enriched with particular regulatory binding motifs. The regulon activities within each cell are then scored in a binary system matrix across all cell types, and this matrix is then subject to dimensionality reduction to analyze for specific cell types or biological states based on shared common regulatory mechanisms. This method was found to be robust to dropouts because the analysis involves scoring the regulatory subnetworks rather than single genes. Others have proposed algorithms based on the multivariate information theory (Partial Information Decomposition and Context)
As our understanding of epigenetic mechanisms and their central roles in human (patho)biology continues to increase rapidly, network-based approaches have also accelerated this effort. In silico analysis of the interactions between pulmonary hypertension network genes and miRNAs prioritized by a hypergeometric analysis led to the identification of a microRNA (miR-21) as a key regulator of multiple biological pathways central to the pathogenesis of the disease.
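A hypergeometric analysis of the kind mentioned above asks whether the predicted targets of a miRNA overlap a disease gene network more than expected by chance. The source does not specify how the analysis was parameterized, so the following is a generic sketch; the function name, argument names, and the toy numbers are assumptions.

```python
from math import comb

def hypergeom_enrichment_p(universe_size, network_size, target_size, overlap):
    """P(X >= overlap) when target_size genes are drawn without replacement
    from a universe containing network_size disease-network genes."""
    upper = min(network_size, target_size)
    total = comb(universe_size, target_size)
    favorable = sum(
        comb(network_size, k) * comb(universe_size - network_size, target_size - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / total

# Hypothetical toy numbers: 10 genes in total, 4 in the disease network,
# and a miRNA with 3 predicted targets, all 3 of which fall in the network.
p_toy = hypergeom_enrichment_p(universe_size=10, network_size=4, target_size=3, overlap=3)
```

Ranking miRNAs by this p-value (after multiple-testing correction) is one way to prioritize candidate regulators of a disease network.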
Large-scale genome-wide investigations of DNA methylation in network perturbation settings have been widely adopted to investigate the durable effect of an earlier exposure or a chronic process, such as aging
With ongoing advances in bioinformatics, subsequent studies apply more integrative approaches, combining multiple layers of different epigenetic mechanisms with other multi-omic networks across a wide range of disease processes (networks of networks). A recently developed model, the Composite Network–Based Inference for miRNA-Disease Association Prediction, used a random walk method to integrate multiple networks mapping miRNA-disease, long noncoding RNA-disease, and miRNA-long noncoding RNA interactions and then used this network of networks to infer novel miRNA-disease associations across multiple malignant tumors.
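Random walk methods of the kind mentioned above score every node by how often a walker that starts from known seed nodes visits it. The following is a generic random walk with restart on a single undirected network, offered only as an illustration of the principle; it is not the published composite-network model, and the restart probability, function name, and toy network are assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that returns to the seeds
    with probability `restart` at each step. Assumes every node has >= 1 neighbor."""
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            share = (1.0 - restart) * p[n] / len(adj[n])  # spread evenly over neighbors
            for nb in adj[n]:
                nxt[nb] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical toy network: a path A-B-C-D with A as the only seed
# (for example, a miRNA already associated with the disease of interest).
toy_adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
proximity = random_walk_with_restart(toy_adj, seeds={"A"})
```

Nodes closer to the seed receive higher scores, so ranking unlabeled nodes by this score is one way to propose new associations.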
Remaining challenges include integration of the highly dynamic and plastic nature of epigenetic modification mechanisms into human gene regulatory networks, as well as the incorporation of temporal- and spatial-specificity elements to such networks. Although these efforts are nascent, ongoing collaborative initiatives, including the Human Epigenome Project and the NIH Roadmap Epigenome Project, represent major efforts to catalog systems-level condition- and tissue-specific epigenetic data. Lastly, the interactions among different epigenetic mechanisms and their interface with other biological networks (eg, epigenetic-PTM or epigenetic–metabolic networks) are important areas of active investigation in many human disease settings.
integrated the methylation and miRNA network data with the transcriptomic data derived from epicatechin-treated human umbilical vein endothelial cells and identified the multilevel regulatory effects of epicatechin metabolites on endothelial phenotypes, including the pathways modulating endothelial-leukocyte interactions and vascular permeability.
Metabolomics and Network Medicine
The study of endogenous and exogenous metabolites offers important insights into the upstream biochemical reactions in a biological system and can serve as an important intermediate that connects genotypes and phenotypes as well as drug actions. Earlier studies focused on linking a small number of metabolites to a specific disease state. With ongoing advances in mass spectrometry-based technologies, the ability to generate unbiased, large-scale metabolomic data and interpret those data in a biological network context continues to increase rapidly.
Remaining challenges include the identification of unknown metabolites and their functional characterization, accurate modeling to estimate network connectivity and reaction kinetics, and standardized approaches to high-throughput data to obtain biologically relevant information. These challenges are being actively addressed by the increasing number of important innovations in metabolomics data processing,
facilitate further efforts to place a metabolite in a relevant biological pathway. Systems-level metabolomics studies are also increasingly paired with other omic platforms.
Given the high degree of cross-talk among multiple biological systems, a perturbation in one domain inevitably leads to a change(s) in others. Our increasing ability to integrate large-scale, multidimensional biological data to unravel complex human (patho)biology and connect genotypes and (patho)phenotypes promises to provide unprecedented insight into disease mechanisms (Figure 3). The optimal network-based approach for this formidable task is yet to be determined.
Figure 3. Integrating multidimensional biological networks. Studying the cross-talk among networks representing different biological domains and their integration (networks of networks) enables us to approach complex disease mechanisms and therapeutic decisions with greater precision. This integration should also involve network interactions with long-term and short-term environmental exposures, as well as microbiome interactions.
recently reviewed different conceptual approaches to this important problem and provided illustrative examples. One approach involves knowledge-based construction of a novel network by consolidating and constraining all available -omic databases. Genome-scale transcriptional regulatory networks have been reconstructed by integrating transcriptomic databases with transcription factor–binding site data. They were further expanded to include additional layers of regulatory mechanisms, including multiple epigenetic and chromosomal interaction data, in combination with PPI and PTM data (ENCODE Project Consortium).
An alternative approach is to use bioinformatics and machine learning tools to infer biological networks from the available multi-omic data. Zhang and colleagues
to integrate the transcriptomic, DNA methylation, and microRNA profiles from 385 ovarian cancer samples from the Cancer Genome Atlas subnetworks. The multidimensional module constructed by projecting the data from each dimension onto a common space revealed altered transforming growth factor-β signaling, a finding that became apparent only after integrating all three data sets. In addition, pathway knowledge has been widely used to integrate multidimensional data. The premise is that knowledge about genomic interactions from existing pathway databases could inform the study of altered genomic expression or interactions in a disease state. If the directionality of altered interactions among multiple biological entities within a given pathway is highly correlated, it was hypothesized that this pathway may reveal important biological processes about a disease state. One example of this integrative pathway approach is Pathway Representation and Analysis by Direct Reference on Graphical Models (PARADIGM), which uses a probabilistic graphical model to construct factor graphs to help integrate multidimensional patient data, including gene expression levels and gene copy numbers, with carefully curated pathway data to infer disease-associated alterations in pathways.
Distinct PARADIGM clusters of patients with glioblastoma were identified with clinical correlations, whereas clustering based on data from one biological dimension alone revealed no such correlations. Oncologic research is one of the leading areas in which PARADIGM or similar algorithms have been widely adopted to integrate multiplatform patient data.
Network Approach to Reappraise Complex Phenotypes and Reclassify Human Diseases
Despite our increasing ability to identify molecular changes associated with disease states, there remains a great discordance between these findings and our understanding of clinical phenotypic heterogeneity. In constructing the Human Disease Symptom Network interconnecting shared symptoms and shared gene networks that involve 133,106 interactions among 1596 disease entities, Zhou et al observed that a disease with greater diversity in symptoms (phenotypes) was associated with more complex and diverse cellular and molecular mechanisms. With rapidly increasing multi-omic data and our ability to appreciate more granular phenotypic heterogeneity in complex diseases, network approaches have been applied to reappraise the conventional disease classification system, improve risk stratification, and potentially guide individualized treatment plans (Figure 4). The ongoing multicenter NIH- and National Heart, Lung, and Blood Institute–sponsored Redefining Pulmonary Hypertension through Pulmonary Vascular Disease Phenomics study is one such example wherein network-based integration of multi-omic molecular and deep phenotypic analyses will help lead to more accurate reclassification of pulmonary hypertension.
Figure 4. Network approach to precision phenotype and rational polypharmacy. A: The conventional reductionistic therapy decision involves identifying patients with a common endophenotype (eg, obesity) who may otherwise have distinctive underlying biological phenotypes (depicted in red, blue, or green). Some patients may undergo limited genotyping. Drug therapies are initiated in this heterogeneous patient group based on population-based clinical trial results with limited consideration for the individual's underlying biology. B: The precision medicine approach involves deep phenotyping of each individual with a common endophenotype using multi-omic platforms and clinical assessments. These data are integrated using network analysis to determine more precise phenotypes with their key molecular components (gray nodes). On the basis of such analysis in the context of the interactome, a drug target(s) (red, blue, or green nodes) can be determined for each phenotype. From Leopold JA, Loscalzo J: Emerging role of precision medicine in cardiovascular disease. Circ Res 2018, 122:1302–1315.
A recent network analysis integrating 39 measured variables from invasive cardiopulmonary exercise testing involving 738 patients with uncharacterized exercise intolerance revealed four novel patient subgroup clusters with distinct clinical outcomes. On the basis of this phenotypic network model and its predictive value for clinical outcome, a novel risk stratification system was proposed, representing another example of the application of network medicine to phenotype classification.
A novel polyhierarchical disease classification system has been proposed using the network-based integration of known molecular and phenotypic profiles of disease entities (Figure 5). The new classification system consists of 233 overlapping disease subcategories, which subsequently are clustered to form 17 novel disease categories (new chapters). Compared with the International Classification of Diseases, Ninth Revision (ICD-9; https://www.cdc.gov/nchs/icd/icd9cm.htm, last accessed January 14, 2019) disease chapters, the new disease categories had higher modularity (reflective of more accurate representation of molecular and phenotypic associations) and overall shorter minimum shortest path lengths between the disease pairs that belonged to the same disease category in the PPI network (reflective of more shared genes). Under this new system, a disease with greater molecular diversity is appropriately reclassified into multiple disease chapters and subcategories. For example, malignant neoplasm of the pancreas (ICD-9 code 157) is reclassified into four new disease chapters and 11 subcategories, which accurately reflect the current understanding of molecular and phenotypic heterogeneity underlying pancreatic cancer biology. Overall, such molecular- and phenotype-based disease reclassification efforts promise to bring greater precision to disease taxonomy and derivative (precision) therapeutic approaches.
Figure 5. Network approach to molecular- and phenotype-based disease classification. The integrated disease network is constructed from the multiple disease networks built based on database-curated disease-disease molecular or phenotypic associations. This integrated network consisted of 1857 nodes depicting distinct disease entities and 35,114 links depicting molecular or phenotypic associations among the disease pairs. A total of 233 overlapping communities were identified from this network. They represent novel disease subcategories that are distinct from the International Classification of Diseases, Ninth Revision (https://www.cdc.gov/nchs/icd/icd9cm.htm) chapters or their subcategories. These disease subcategories subsequently are placed in a network based on the shared disease entities (depicted by weighted links). Nonoverlapping community detection analysis leads to the identification of 17 distinct clusters of disease subcategories that represent the new chapter-level disease categories. These chapters contain varying numbers of disease entities and subcategories.
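The modularity comparison described above can be illustrated with a minimal sketch. It assumes the standard Newman modularity for a nonoverlapping partition of an undirected, unweighted network, which may differ from the exact measure used in the cited classification study. The function name `modularity` and the toy network are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a nonoverlapping partition.

    edges: list of (u, v) pairs of an undirected, unweighted network.
    community_of: dict mapping each node to its community label.
    Q = sum over communities c of [ L_c / m - (d_c / 2m)^2 ], where m is the
    number of edges, L_c the number of edges inside c, and d_c the summed
    degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("network has no edges")
    intra = defaultdict(int)   # edges with both endpoints in community c
    degree = defaultdict(int)  # summed node degree per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy disease network: two triangles joined by a single link.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
# A grouping that follows the two triangles, and one that cuts across them.
good_grouping = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
poor_grouping = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

q_good = modularity(toy_edges, good_grouping)
q_poor = modularity(toy_edges, poor_grouping)
```

In this toy case the grouping that follows the densely linked triangles scores higher than the grouping that cuts across them, which is the sense in which a higher modularity indicates categories that better match the underlying associations.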
The rate of the current drug development process lags well behind the rapid advances made in biological sciences (structural biology and bioinformatics), largely because of the traditional one disease–one target reductionist approach.
In reality, most drugs affect more than one protein target with their net drug effect—including both therapeutic and adverse effects—best reflected by the changes they bring to a subnetwork of interconnected molecules.
Network approaches provide unique abilities with which to perform in silico predictions of drug action in a complex biological context. Drugs can be mapped to the human interactome through their (known) targets as part of a biological network with their effect on a node(s) (target) represented with edges (Figure 4, A and B).
When the disease genes associated with MI, currently available drugs for MI, and their drug targets were incorporated into the human PPI network, the disease genes and drug targets were mapped near one another.
Construction of a bipartite network, including the MI disease gene products and MI drug targets, led to the identification of 12 drug target–disease modules. These modules provided novel mechanistic insight into how drugs interact with various disease components and reshape the biological network.
Current bioinformatics-based curation of drug toxicity often is based on user reporting and likely gives only a partial picture of the true extent of the problem. Huang et al used machine learning and logistic regression approaches to predict adverse drug-target interactions within an interactive network and demonstrated their superiority compared with a non–network-based method. An additional advantage of network-based assessment of an adverse outcome is that this information could be obtained during the early phases of drug target selection, thereby theoretically reducing development costs considerably.
Most complex diseases involve multiple biological pathways that contribute to disease outcomes. Network approaches to characterizing each of these disease-associated pathways and to carefully defining their interactions and regulatory control(s) may provide a powerful platform for rational polypharmacy and individualized therapy. Garmaroudi et al
designed an integrative computational model that incorporated multiple well-characterized chemical reactions involved in the nitric oxide–guanylyl cyclase pathway. This algorithm, in turn, predicted the efficacy of combinatorial therapies that involve inhibitors of one, two, or three reactions. The model predicted that the combined inhibition of three key reactions had the most significant impact on cyclic guanosine monophosphate levels, which was significantly more efficacious than phosphodiesterase inhibitor monotherapy. These findings were validated in an experimental model, suggesting the possibility of achieving greater efficacy with a multitargeted approach. Exciting new directions include incorporation of various tools to create perturbations in a selected network (eg, gene silencing in transcriptomic networks or antimetabolite treatment in metabolic networks) and observing the behavior of other related network dimensions to assess the molecular interconnections among different biological dimensions and predict drug actions across multiple biological systems.
In this section, the recent studies that exemplify novel applications of network analysis in disease pathobiology are highlighted. Topological networks of common intermediate phenotypes (endophenotypes) have been constructed by Ghiassian et al, including literature-curated molecular determinants of inflammation, thrombosis, and fibrosis, in the context of the human PPI network. These endophenotype modules overlap in their network topology, revealing a significant number of shared mechanisms among them. These overlapping modules are highly enriched with the genes associated with diseases and disease risks and may serve as fruitful targets for therapeutic interventions applicable to multiple disease processes.
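One common way to quantify whether a module is enriched with disease-associated genes is a one-sided hypergeometric test. The sketch below assumes that test for illustration only; the cited study may have used a different statistic. The function name `enrichment_p_value` and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(universe_size, n_disease_genes, module_size, n_overlap):
    """One-sided hypergeometric P(X >= n_overlap).

    X is the number of disease genes found in a module of `module_size` genes
    drawn at random, without replacement, from a universe of `universe_size`
    genes of which `n_disease_genes` are disease associated.
    """
    upper = min(n_disease_genes, module_size)
    total = comb(universe_size, module_size)
    tail = sum(
        comb(n_disease_genes, k)
        * comb(universe_size - n_disease_genes, module_size - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 genes in the universe, 3 of them disease genes,
# and a module of 3 genes that contains all 3 disease genes.
p_full_overlap = enrichment_p_value(10, 3, 3, 3)
# Requiring an overlap of at least 0 genes is always satisfied.
p_no_requirement = enrichment_p_value(10, 3, 3, 0)
```

A small P value indicates that the observed overlap would be unlikely if the module genes were drawn at random from the interactome, although in practice the result depends on how the background gene universe is defined.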
A network analysis of molecular mediators of the placebo treatment response enabled construction of the placebome module within the comprehensive human interactome. This module was notably enriched with brain-specific proteins and neurotransmission signaling components, and its network proximity to various disease modules within the human interactome correlated with the magnitude of the placebo effect in specific disease processes.
Constructing a subnetwork that integrates fibrosis-specific PPIs and aldosterone-regulated genes expressed by human endothelial cells led to the identification of NEDD9-mediated novel mechanisms of vascular fibrosis in pulmonary arterial hypertension.
In a recent study of patients with type 2 diabetes mellitus, the controllability of the network was examined using the control centrality measure to identify the high control centrality pathways in a pancreatic islet–specific gene regulatory network. NFATC4 was identified as one of the important variants that regulate gene expression in multiple high control centrality pathways, and in vitro silencing of its gene product in animal islets led to changes in the expression of multiple downstream genes implicated in type 2 diabetes pathobiology.
examined new drug-disease interactions involving over 900 FDA-approved drugs by measuring the network proximity between known drug targets and disease proteins in the human interactome. Selected novel drug-disease associations were validated using large-scale longitudinal patient databases and in vitro mechanistic assays. This study thus proposes a novel PPI network-based platform for drug repurposing.
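Network proximity between drug targets and disease proteins can be defined in several ways. The minimal sketch below assumes the "closest" measure, that is, the mean over drug targets of the shortest-path distance to the nearest disease protein; whether the study above used exactly this definition is an assumption. The helper names and the toy interactome are hypothetical, and published analyses typically also compare the raw value against randomized node sets, which is omitted here.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path (hop) distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_proximity(adj, drug_targets, disease_proteins):
    """Mean distance from each drug target to its nearest disease protein."""
    if not drug_targets or not disease_proteins:
        raise ValueError("both node sets must be non-empty")
    total = 0
    for target in drug_targets:
        dist = bfs_distances(adj, target)
        reachable = [dist[p] for p in disease_proteins if p in dist]
        if not reachable:
            raise ValueError(f"no disease protein reachable from {target!r}")
        total += min(reachable)
    return total / len(drug_targets)


# Hypothetical toy interactome: a simple chain A - B - C - D - E.
toy_adj = build_adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
disease = {"C", "E"}
proximity_far = closest_proximity(toy_adj, {"A"}, disease)        # A is 2 hops from C
proximity_mixed = closest_proximity(toy_adj, {"A", "D"}, disease)  # D is 1 hop from C or E
```

Smaller values indicate that a drug's targets lie closer to the disease module, which is the intuition behind ranking candidate drug-disease pairs for repurposing.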
Current Challenges and Opportunities
Network analysis of complex human diseases is currently limited by the incompleteness of the human interactome. We anticipate that the interactome will continue to grow with ongoing advances in high-throughput technologies and bioinformatics.
The current interactome consists of curated PPIs defined simply by their binary biophysical relationships, with limited understanding or annotation of protein-binding domains or motifs. There is an ongoing effort to build a domain-specific interactome in which commonly shared domains or motifs, such as the SH3 domain or PDZ domain, are cloned and screened for their interacting partners.
Limitations of this approach include the possibility that physical and biochemical properties change in a physiologic setting (compared with the cell-free experimental setting) in ways that alter protein binding at these domains. In addition, most PPIs in the current interactome were based on induced protein expression levels in experimental yeast cells, which may differ significantly from the endogenous environment where relevant proteins are typically expressed.
Incorporation of tissue- and disease-specific context to the current interactome by integrating existing PPIs with gene expression data and other methods is an area of active investigation.
The disease-associated isoform of lamin A in Hutchinson-Gilford progeria syndrome is a striking example: its interactions with other proteins differ from those of the non–disease-associated isoform.
and their PPIs at a proteomic level is, therefore, much needed.
Lastly, representation of dynamic information in a network setting remains a major challenge, largely owing to the difficulty involved with system-wide assessment of reaction kinetic parameters and the mathematical and experimental challenges of capturing multivariate complexities of these reactions accurately. There is, therefore, an ongoing effort to create dynamic in situ (ie, intracellular) computational models for biochemical reactions.
The recently developed Dynamics-Agnostic Network Models were more robust to link removal than biochemical reaction models, and the authors contend that they provide a comparable degree of predictive accuracy even with the current level of incompleteness of the interactome.
Future Directions
As we move toward the routine use of multi-omic data in preclinical and clinical settings, additional requirements should involve standardized, rigorous bioinformatics methods to process, normalize, and interpret the large-scale data sets obtained from across study modalities.
as well. Integration of psychosocial elements in defining disease networks would also be important for linking all contributing determinants of clinical outcomes.
Network controllability analyses continue to help identify key disease genes and prioritize potential novel therapeutic targets, as exemplified by Sharma et al.
New, exciting directions also include incorporation of time trajectories into the network analysis of chronic, progressive diseases using long-term simulation methods that infer gradual changes in parameters over time.
Reticulotype analysis is a newly emerging concept wherein a patient-specific collection of molecular mutants or variants is examined in the context of his or her unique integrative biological network (reticulome) (Figure 6). No two individuals have identical biological networks. An individual's biological network context undoubtedly shapes the ultimate outcome (phenotype) of a given set of molecular variants (genotype) and, therefore, should be an integral part of any patient-specific data analysis. Individualized reticulotype-based network analysis, therefore, promises to enhance ongoing genotype-phenotype correlation efforts and may facilitate the quest for individualized targeted therapies.
Figure 6. Reticulotype analysis and individualized medicine. Patient-specific genotype-phenotype relationships can be assessed with greater precision by network-based reticulotype analysis. Each individual's unique molecular perturbation findings are examined within the context of his or her unique integrative biological network (reticulome) derived from multi-omic studies. From Leopold JA, Loscalzo J: Emerging role of precision medicine in cardiovascular disease. Circ Res 2018, 122:1302–1315.
Network medicine represents a unique integrative path to accelerate our understanding of complex human diseases and to improve therapeutics with unprecedented breadth and precision.
Protein interaction analysis of senataxin and the ALS4 L389S mutant yields insights into senataxin post-translational modification and uncovers mutant-specific binding with a brain cytoplasmic RNA-encoded peptide.
Epigenomic and transcriptomic approaches in the post-genomic era: path to novel targets for diagnosis and therapy of the ischaemic heart? Position Paper of the European Society of Cardiology Working Group on Cellular Biology of the Heart.
Systems pharmacology and rational polypharmacy: nitric oxide-cyclic GMP signaling pathway as an illustrative example and derivation of the general case.
Controllability in an islet specific regulatory network identifies the transcriptional factor NFATC4, which regulates type 2 diabetes associated genes.
Supported by NIH grants HL061795 (J.L.), HL119145 (J.L.), HG007690 (J.L.), GM 107618, and T32 HL007604 (L.Y.-H.L.), and American Heart Association grant D700382 (J.L.).
Disclosures: J.L. is scientific cofounder of Scipher Medicine, Inc.