2.1 Data retrieval
The DrugBank database (version 5.1.8) [9] is a comprehensive drug database that provides information about drugs and their associated gene targets, indications and pathways. It also lists each drug's chemical and pharmaceutical properties, mechanism of action, pharmacodynamics, toxicity, clinical trials and absorption, distribution, metabolism, excretion and toxicity (ADMET) properties.
To identify drugs related to FTD, "frontotemporal dementia" was used as a search term. For each drug, the Food and Drug Administration (FDA) status, number of targets, target genes, Universal Protein Resource (UniProt) IDs of the target genes and DrugBank ID were retrieved.
NeuroDNet [10], an open-source platform that hosts a collection of disease models, was used to collect the list of susceptibility genes, their UniProt IDs and the disorders associated with these genes.
2.2 Functional enrichment analysis
Functional enrichment analysis is a method used to identify pathways that are statistically overrepresented in a gene list. It identifies the overrepresented genes and the subsystems associated with them, which gives biological insight into the intersecting pathways that could be affected when a drug is administered. WebGestalt, a web-based gene set analysis toolkit [11], was used to analyse and interpret the significantly enriched molecular pathways associated with the target genes.
To perform enrichment analysis, the following steps were followed:
• Homo sapiens was chosen as the organism of interest, overrepresentation analysis (ORA) as the method of enrichment analysis, pathway as the functional database category and Kyoto Encyclopedia of Genes and Genomes (KEGG) as the functional database name. Gene symbol was chosen as the gene ID type, and the list of target genes retrieved from DrugBank was imported into WebGestalt in text (tab-delimited) format.
• The protein-coding genome was selected as the reference gene set for enrichment analysis.
• The Benjamini–Hochberg (BH) procedure was used to control the false discovery rate (FDR) at 0.05.
The enrichment results were displayed as a bar chart (default) and as a table sorted in decreasing order of enrichment ratio. A threshold of 0.05 was applied to both the p-value and the FDR to determine significance. The volcano plot, which plots −log10 FDR against log2 enrichment ratio, can be downloaded to visualize the significantly enriched gene sets. Each gene set from the table was then downloaded as a .csv file for further analysis.
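The overrepresentation test and FDR correction described above can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, the usual choice for ORA, and is not WebGestalt's own code; the function names and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(overlap, set_size, list_size, reference_size):
    """P(X >= overlap) when list_size genes are drawn from a reference of
    reference_size genes, set_size of which belong to the pathway."""
    upper = min(set_size, list_size)
    total = comb(reference_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(reference_size - set_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrichment_ratio(overlap, set_size, list_size, reference_size):
    """Observed overlap divided by the overlap expected by chance."""
    expected = set_size * list_size / reference_size
    return overlap / expected


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: keep pathways whose adjusted p-value is at most 0.05.
raw = [0.01, 0.04, 0.03, 0.005]
significant = [i for i, q in enumerate(benjamini_hochberg(raw)) if q <= 0.05]
```

A pathway is retained when its adjusted p-value does not exceed the chosen FDR threshold, which mirrors the 0.05 cut-off used here.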
2.3 Protein–protein interaction using STRING
To analyse the biological interactions between the target genes and the disease susceptibility genes, a protein–protein interaction (PPI) network was constructed using the STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) v11 database. STRING is a biological database of physical and functional protein–protein interactions derived from public text collections, experimental data, co-expression and genomic context predictions. A text file containing the gene targets (DrugBank) and the disease susceptibility genes (NeuroDNet) was imported into STRING [12].
The network was constructed from interaction data derived from text mining, experiments, databases, co-expression, neighbourhood, gene fusion and co-occurrence. The target genes of the drugs represent the nodes and the interactions between them represent the edges in the PPI network. STRING also provides functional enrichments for a network, including lists of biological processes, molecular functions, cellular components and protein domains. Each function is reported with its count in the network, its strength and its FDR. The network statistics provided by STRING consist of the number of nodes, number of edges, average node degree, average local clustering coefficient, expected number of edges and the PPI enrichment p-value. A very low p-value indicates that the enrichment is statistically significant. The biological significance of the network can be assessed by comparing the observed number of edges with the expected number of edges, using a p-value of 0.05 as the threshold. If the observed edge count is substantially higher than the expected edge count, the proteins are biologically connected as a group and are unlikely to be a random set of proteins of similar size.
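Two of the network statistics listed above, the average node degree and the average local clustering coefficient, follow directly from the edge list. The sketch below is illustrative only: the function name and node labels are hypothetical, only nodes that appear in at least one edge are counted, and STRING's PPI enrichment p-value is not reproduced because it depends on STRING's own random background model.

```python
from itertools import combinations


def network_stats(edges):
    """Basic statistics for an undirected network given as (node, node) pairs."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    n_nodes = len(neighbours)
    n_edges = sum(len(nbrs) for nbrs in neighbours.values()) // 2

    coefficients = []
    for nbrs in neighbours.values():
        k = len(nbrs)
        if k < 2:
            # A node with fewer than two neighbours has no neighbour pairs.
            coefficients.append(0.0)
            continue
        # Count the pairs of neighbours that are themselves connected.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbours[u])
        coefficients.append(2 * links / (k * (k - 1)))

    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "average_node_degree": 2 * n_edges / n_nodes,
        "average_local_clustering": sum(coefficients) / n_nodes,
    }


# Hypothetical toy network: a triangle (A, B, C) with one pendant node D.
stats = network_stats([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
```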
The interaction network obtained was refined by removing disconnected nodes and by raising the minimum required confidence score to 0.700 (high confidence) to reduce the number of false positives. The resulting network was imported into Cytoscape [13] and analysed based on node degree and betweenness centrality using the NetworkAnalyzer plug-in of Cytoscape.
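The filtering step can be sketched as follows for a tab-delimited interaction export. The column names `node1`, `node2` and `combined_score`, and the assumption that scores lie between 0 and 1, are illustrative; the actual export may use different headers or a 0 to 1000 scale.

```python
import csv
import io


def filter_high_confidence(tsv_text, min_score=0.700):
    """Keep edges at or above min_score and return them with the nodes
    that remain connected (nodes left without edges are dropped)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    edges = []
    for row in reader:
        score = float(row["combined_score"])
        if score >= min_score:
            edges.append((row["node1"], row["node2"], score))
    connected_nodes = {node for a, b, _ in edges for node in (a, b)}
    return edges, connected_nodes


# Hypothetical export with placeholder protein names.
example = (
    "node1\tnode2\tcombined_score\n"
    "A\tB\t0.9\n"
    "B\tC\t0.5\n"
    "C\tD\t0.7\n"
    "E\tF\t0.2\n"
)
edges, nodes = filter_high_confidence(example)
```

In this toy input, E and F lose their only edge and therefore drop out of the network.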
A high node degree indicates that a node interacts with many other nodes, and a high betweenness centrality (based on shortest paths) indicates that a node lies on many of the shortest paths connecting other nodes in the network. Nodes that score highly on these measures can be classified as hub nodes that form bridges between clusters in a network. These hub nodes play an important role in network architecture, and targeting them can result in the impairment of the entire network.
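Both measures can be computed from an adjacency mapping. The sketch below uses Brandes' algorithm for unweighted, undirected graphs and returns unnormalised betweenness values; network analysis tools may report normalised values instead, so absolute numbers can differ. The function names and node labels are hypothetical.

```python
from collections import deque


def node_degree(neighbours):
    """Number of direct interaction partners of each node."""
    return {node: len(nbrs) for node, nbrs in neighbours.items()}


def betweenness_centrality(neighbours):
    """Unnormalised shortest-path betweenness (Brandes' algorithm)."""
    score = {v: 0.0 for v in neighbours}
    for s in neighbours:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in neighbours}
        sigma = {v: 0 for v in neighbours}
        dist = {v: -1 for v in neighbours}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in neighbours[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in neighbours}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical star network: X bridges three otherwise unconnected nodes.
star = {"X": {"L1", "L2", "L3"}, "L1": {"X"}, "L2": {"X"}, "L3": {"X"}}
degrees = node_degree(star)
betweenness = betweenness_centrality(star)
```

In the star example, X has the highest degree and lies on every shortest path between the other nodes, which is the pattern expected of a hub.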
2.4 Drug–gene network analysis and visualization using Cytoscape
Cytoscape 3.8.2 is an open-source, publicly available tool written in Java that is used for network representation and analysis.
To visualize the drug–target interactions for the enriched categories, a Python script was written to read the .csv files downloaded for each enriched gene set from WebGestalt, as mentioned earlier. The script mapped the UniProt IDs from each gene set to the respective target drugs from DrugBank and wrote them as columns into individual .csv files. Each resultant .csv file was imported as a network into Cytoscape. The genes were chosen as source nodes and the drugs as target nodes. The edges correspond to the interactions between the nodes. The individual networks were merged and analysed based on node degree and betweenness centrality. The "Style" option was used to change the shape of the drugs and gene targets to ellipse and round rectangle, respectively. The node colour was set to reflect the node degree using continuous mapping. The darker-coloured nodes represented a higher node degree while the lighter-coloured nodes represented a lower node degree.
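The mapping step can be sketched as follows. This is not the script used in the study: the function names, the output column names `source_gene` and `target_drug`, and the placeholder identifiers are all hypothetical, and the real input files would first need to be parsed into the structures shown.

```python
import csv
import io


def gene_drug_edges(gene_set, drug_targets):
    """Pair each gene in an enriched gene set with the drugs that target it.

    gene_set: iterable of UniProt IDs from one enriched gene set.
    drug_targets: dict mapping a drug name to the set of UniProt IDs it targets.
    """
    wanted = set(gene_set)
    return sorted(
        (gene, drug)
        for drug, targets in drug_targets.items()
        for gene in targets
        if gene in wanted
    )


def edges_to_csv(edges):
    """Serialise the edge list with a source (gene) and a target (drug) column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source_gene", "target_drug"])
    writer.writerows(edges)
    return buffer.getvalue()


# Placeholder identifiers, not real DrugBank or UniProt entries.
targets = {"DRUG_A": {"P00001", "P00002"}, "DRUG_B": {"P00002", "P00003"}}
edges = gene_drug_edges(["P00001", "P00002"], targets)
table = edges_to_csv(edges)
```

A two-column table of this kind can be loaded as a network by assigning the first column to source nodes and the second to target nodes.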
2.5 Historeceptomics approach
To further evaluate the region of action of multi-target drugs and multi-drug combinations, a historeceptomics analysis was performed. Historeceptomics is a bioinformatics method that integrates tissue specificity with drug–target data.
Using the historeceptomics profiler, one can narrow down the common tissues targeted by the FTD drugs and make an educated guess about possible drug–drug interactions at the tissue level. The historeceptomics profiler [14] is a tool that assigns a historeceptomics (HR) score to each target–tissue pair of the input drug and ranks the pairs by the drug's likelihood of eliciting a phenotype in that tissue. The five drugs with the highest node degrees were used as input to the tool. The output table lists the possible targets (proteins/drug receptors) for the specific drug, the corresponding tissues displaying the drug activity, the target gene, its UniProt ID and the source of the drug (ChEMBL/DrugBank). The score section of the table consists of the Z-score (the observed gene expression value in a given tissue compared with the mean of the gene expression values in other tissues), the intensity (raw gene expression value), the HR score (a combination of drug–target affinity and tissue expression values, where a higher score indicates a higher likelihood that the drug acts in that tissue) and the p-value of the HR score. The common target tissues of the five drugs were tabulated and taken forward for further analysis.
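The Z-score and the tabulation of common tissues can be illustrated with a short sketch. The exact formula used by the profiler is not stated here, so the version below is an assumption: it standardises the expression in one tissue against the mean and sample standard deviation of the remaining tissues. The function names, tissue labels and values are hypothetical.

```python
from statistics import mean, stdev


def tissue_z_score(expression, tissue):
    """Standardise a gene's expression in one tissue against all other tissues.

    expression: dict mapping tissue name to the gene's expression value.
    Assumed formula: (value - mean of others) / sample std of others.
    """
    others = [value for name, value in expression.items() if name != tissue]
    return (expression[tissue] - mean(others)) / stdev(others)


def common_tissues(tissues_by_drug):
    """Tissues reported for every drug in the input mapping."""
    tissue_sets = [set(tissues) for tissues in tissues_by_drug.values()]
    return set.intersection(*tissue_sets) if tissue_sets else set()


# Hypothetical expression values for one target gene.
expression = {"cortex": 10.0, "liver": 2.0, "heart": 4.0, "lung": 3.0}
z = tissue_z_score(expression, "cortex")

# Hypothetical tissue lists for two drugs.
shared = common_tissues({"DRUG_A": ["cortex", "liver"], "DRUG_B": ["cortex", "heart"]})
```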
2.6 Drug–drug association network
To analyse the adverse drug–drug interactions among the FTD drugs, drug interactions for each drug were collected from the Drug Interaction Lookup tool by DrugBank.
A Python script was written to extract only the interactions with the FTD drugs present in our drug list. The results were used to construct a drug–drug interaction network in Cytoscape. The drugs were depicted as nodes while the edges depicted the adverse drug–drug interactions. The colours of the nodes were changed to reflect the node degree. The darker-coloured nodes represented more interactions in the network while the lighter-coloured nodes represented fewer interactions in the network.
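The extraction step can be sketched as follows. This is an illustration rather than the script used in the study; the function names and drug labels are hypothetical, and the interaction data are assumed to have already been collected into a mapping from each drug to its interacting drugs.

```python
def ftd_interaction_edges(interactions, ftd_drugs):
    """Keep only interactions in which both drugs are on the FTD drug list.

    interactions: dict mapping a drug to the drugs it is reported to interact with.
    Returns a sorted list of unique undirected edges as (drug, drug) tuples.
    """
    ftd = set(ftd_drugs)
    edges = set()
    for drug, partners in interactions.items():
        if drug not in ftd:
            continue
        for partner in partners:
            if partner in ftd and partner != drug:
                # frozenset makes (a, b) and (b, a) the same edge.
                edges.add(frozenset((drug, partner)))
    return sorted(tuple(sorted(edge)) for edge in edges)


def interaction_degree(edges):
    """Number of interaction partners per drug, used here to colour the nodes."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


# Placeholder drug names; DRUG_X is not on the FTD list and is filtered out.
lookup = {
    "DRUG_A": ["DRUG_B", "DRUG_X"],
    "DRUG_B": ["DRUG_A", "DRUG_C"],
    "DRUG_C": ["DRUG_B"],
}
ddi_edges = ftd_interaction_edges(lookup, ["DRUG_A", "DRUG_B", "DRUG_C"])
ddi_degree = interaction_degree(ddi_edges)
```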