Send to

Choose Destination
J Theor Biol. 2012 Jan 21;293:174-88. doi: 10.1016/j.jtbi.2011.10.016. Epub 2011 Oct 25.

New Markov-Shannon Entropy models to assess connectivity quality in complex networks: from molecular to cellular pathway, Parasite-Host, Neural, Industry, and Legal-Social networks.

Author information

Department of Microbiology & Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain.


Graph and Complex Network theory is expanding its application to different levels of matter organization such as molecular, biological, technological, and social networks. A network is a set of items, usually called nodes, with connections between them, which are called links or edges. There are many different experimental and/or theoretical methods to assign node-node links depending on the type of network we want to construct. Unfortunately, the use of a method for experimental reevaluation of the entire network is very expensive in terms of time and resources; thus the development of cheaper theoretical methods is of major importance. In addition, different methods to link nodes in the same type of network are not totally accurate in such a way that they do not always coincide. In this sense, the development of computational methods useful to evaluate connectivity quality in complex networks (a posteriori of network assemble) is a goal of major interest. In this work, we report for the first time a new method to calculate numerical quality scores S(L(ij)) for network links L(ij) (connectivity) based on the Markov-Shannon Entropy indices of order k-th (θ(k)) for network nodes. The algorithm may be summarized as follows: (i) first, the θ(k)(j) values are calculated for all j-th nodes in a complex network already constructed; (ii) A Linear Discriminant Analysis (LDA) is used to seek a linear equation that discriminates connected or linked (L(ij)=1) pairs of nodes experimentally confirmed from non-linked ones (L(ij)=0); (iii) the new model is validated with external series of pairs of nodes; (iv) the equation obtained is used to re-evaluate the connectivity quality of the network, connecting/disconnecting nodes based on the quality scores calculated with the new connectivity function. This method was used to study different types of large networks. The linear models obtained produced the following results in terms of overall accuracy for network reconstruction: Metabolic networks (72.3%), Parasite-Host networks (93.3%), CoCoMac brain cortex co-activation network (89.6%), NW Spain fasciolosis spreading network (97.2%), Spanish financial law network (89.9%) and World trade network for Intelligent & Active Food Packaging (92.8%). In order to seek these models, we studied an average of 55,388 pairs of nodes in each model and a total of 332,326 pairs of nodes in all models. Finally, this method was used to solve a more complicated problem. A model was developed to score the connectivity quality in the Drug-Target network of US FDA approved drugs. In this last model the θ(k) values were calculated for three types of molecular networks representing different levels of organization: drug molecular graphs (atom-atom bonds), protein residue networks (amino acid interactions), and drug-target network (compound-protein binding). The overall accuracy of this model was 76.3%. This work opens a new door to the computational reevaluation of network connectivity quality (collation) for complex systems in molecular, biomedical, technological, and legal-social sciences as well as in world trade and industry.

[Indexed for MEDLINE]

Supplemental Content

Full text links

Icon for Elsevier Science
Loading ...
Support Center