Identifying similar diseases can deepen understanding of their underlying causes and may even hint at possible treatments, since similar diseases might share drug targets. This requires a disease-disease similarity measure that reflects the underlying molecular interactions and biological pathways. DeCoaD is a web-based application for finding and clustering similar diseases.

DeCoaD combines protein-protein interaction and gene-disease association data, and uses the flow of information in a disease-protein network to calculate the “correlation” between any two diseases that have gene associations. In this network, each disease is connected to the proteins encoded by the genes known to be associated with it, and two proteins are linked if they are known to interact. The information flow is modeled by a random walk that starts from and ends at a disease. For a given disease, the method assigns to each protein in the network a weight equal to the number of times the random walker is expected to visit that protein. The correlation between a pair of diseases is then calculated as the cosine similarity between their weight vectors. Note, however, that the correlation defined here is not necessarily a measure of phenotypic similarity between two diseases; it measures the similarity of the genes or biological processes involved.
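The weighting scheme described above can be illustrated with a minimal Python sketch. It is not DeCoaD's implementation. The toy network, the function names, and the transition rule are all assumptions made for illustration: the walker moves uniformly at random to any neighbor, and the walk ends as soon as it steps onto any disease node. The reference describes the actual model.

```python
import math

# Hypothetical toy network: protein-protein interactions and the proteins
# encoded by the genes associated with each disease.
PPI = {("p1", "p2"), ("p2", "p3"), ("p3", "p4")}
DISEASE_PROTEINS = {
    "dA": {"p1", "p2"},
    "dB": {"p2", "p3"},
    "dC": {"p4"},
}


def build_neighbors(ppi, disease_proteins):
    """Return (protein -> neighboring proteins, protein -> number of linked diseases)."""
    prot_nbrs = {}
    for a, b in ppi:
        prot_nbrs.setdefault(a, set()).add(b)
        prot_nbrs.setdefault(b, set()).add(a)
    n_dis = {}
    for prots in disease_proteins.values():
        for p in prots:
            prot_nbrs.setdefault(p, set())
            n_dis[p] = n_dis.get(p, 0) + 1
    return prot_nbrs, n_dis


def expected_visits(disease, ppi, disease_proteins, tol=1e-12, max_iter=10000):
    """Expected number of visits to each protein for a walk started at `disease`.

    Assumed transition rule: uniform choice among all neighbors; stepping
    onto any disease node ends the walk.
    """
    prot_nbrs, n_dis = build_neighbors(ppi, disease_proteins)
    start = disease_proteins[disease]
    if not start:
        raise ValueError("disease has no associated proteins")
    # First step: from the disease to one of its proteins, uniformly.
    current = {p: 1.0 / len(start) for p in start}
    visits = dict.fromkeys(prot_nbrs, 0.0)
    for _ in range(max_iter):
        for p, mass in current.items():
            visits[p] += mass
        nxt = {}
        for p, mass in current.items():
            degree = len(prot_nbrs[p]) + n_dis.get(p, 0)
            # Probability mass that steps onto a disease node is absorbed,
            # so only the protein-to-protein moves are propagated.
            for q in prot_nbrs[p]:
                nxt[q] = nxt.get(q, 0.0) + mass / degree
        if sum(nxt.values()) < tol:
            break
        current = nxt
    return visits


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


w_a = expected_visits("dA", PPI, DISEASE_PROTEINS)
w_b = expected_visits("dB", PPI, DISEASE_PROTEINS)
w_c = expected_visits("dC", PPI, DISEASE_PROTEINS)
print("corr(dA, dB) =", round(cosine(w_a, w_b), 3))
print("corr(dA, dC) =", round(cosine(w_a, w_c), 3))
```

In this toy network dA and dB share a protein and sit close together, so their weight vectors point in similar directions. dC is attached only to the far end of the chain and correlates less with dA.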

On the main web page of DeCoaD, the user is asked to choose between two options: (1) see a list of diseases similar to a particular disease; (2) see a list of pairs of significantly similar diseases. Choosing option (1) opens a new page on which the user enters the name of the disease of interest and some other parameters. DeCoaD then reports the correlations between the user-specified disease and the others present in the disease network. The input disease does not have to be in the network already; the user has the option to add a new disease to it. DeCoaD also uses a probabilistic clustering algorithm that produces overlapping clusters of diseases, and reports the membership probabilities of the disease of interest in the different clusters. Each cluster is assigned a center, represented by a vector defined as the weighted average of the weight vectors of all diseases (for details please see the reference). The weights associated with the diseases or the cluster centers can be used to perform enrichment analysis; for this purpose DeCoaD provides an interface to SaddleSum. If option (2) is chosen on the main page, the program reports a list of disease pairs that rank higher than a cutoff provided by the user.
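The cluster centers and the option (2) pair listing can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not DeCoaD's code: the averaging weights are taken to be the cluster membership probabilities, the cutoff is applied directly to the correlation value, and all names and numbers are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical per-disease weight vectors (protein -> expected visit count).
WEIGHTS = {
    "dA": {"p1": 1.0, "p2": 1.0},
    "dB": {"p1": 1.0, "p2": 0.5},
    "dC": {"p3": 2.0},
}
# Hypothetical membership probabilities of each disease in one cluster.
MEMBERSHIPS = {"dA": 0.5, "dB": 0.5, "dC": 0.0}


def cluster_center(weight_vectors, memberships):
    """Weighted average of the disease weight vectors.

    Assumption: the averaging weights are the normalized memberships.
    """
    total = sum(memberships.values())
    if total == 0:
        raise ValueError("memberships sum to zero")
    center = {}
    for disease, vec in weight_vectors.items():
        share = memberships.get(disease, 0.0) / total
        for protein, w in vec.items():
            center[protein] = center.get(protein, 0.0) + share * w
    return center


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similar_pairs(weight_vectors, cutoff):
    """Disease pairs whose correlation is at least `cutoff`, best first."""
    pairs = []
    for a, b in combinations(sorted(weight_vectors), 2):
        score = cosine(weight_vectors[a], weight_vectors[b])
        if score >= cutoff:
            pairs.append((a, b, score))
    return sorted(pairs, key=lambda item: item[2], reverse=True)


center = cluster_center(WEIGHTS, MEMBERSHIPS)
pairs = similar_pairs(WEIGHTS, cutoff=0.5)
print(center)
print(pairs)
```

A center computed this way is a protein weight vector of the same form as a disease's, which is why either one can be passed to an enrichment tool.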

Maintained by Mehdi Bagheri Hamaneh.