• A novel approach for global alignment for multiple biological networks

    • Authors:
      • Warith Eddine Djeddi

        Tel: +216 93 80 9 52 8
        Email: waritheddine@yahoo.fr

        LIPAH, Faculty of sciences of Tunis,
        University of Tunis ElManar,
        2092, Tunis, Tunisia
      • Sadok Ben Yahia

        Tel: + 216 71 872 200
        Email: sadok.benyahia@fst.rnu.tn

        LIPAH, Faculty of sciences of Tunis,
        University of Tunis ElManar,
        2092, Tunis, Tunisia
      • Engelbert Mephu Nguifo

        Tel: +33 4 73 40 76 29
        Email: mephu@isima.fr
        Web page: http://www.isima.fr/~mephu/

        LIMOS, UMR 6158
        Blaise Pascal University, BP 10125,
        63173, Clermont Ferrand, France
    • Summary:
    • Analyzing protein–protein interaction (PPI) networks has been very effective in tackling many problems, such as understanding the genetic factors that impact various diseases [1], drug discovery [2], predicting protein functions, identifying functional modules, and understanding phylogeny from these data. Network alignment approaches can generally be classified as pairwise or multiple, and as local or global. Pairwise approaches align two networks, whereas multiple approaches align three or more. Local alignment approaches detect conserved subnetworks, rather than aligning entire networks, across two (pairwise local alignment) or more (multiple local alignment) networks. However, the aligned regions can overlap, leading to "ambiguous" many-to-many mappings. Global network alignment (GNA), which aligns entire networks, was proposed to address this. In this work, we present MAPPIN (Multiple Alignment for Protein Protein Interactions Networks), a global many-to-many alignment of multiple PPI networks (PPINs) from different species. Our approach is based on NetCoffee [3]; the main differences between the two are summarized in Table 1.

      Table 1: The main differences between MAPPIN and NetCoffee.
      MAPPIN | NetCoffee
      It aligns two or more PPI networks. | It aligns three or more networks, so it cannot align only two networks.
      Topological similarity is used for hub detection and during the seed expansion phase. | Topological similarity is based on the T-Coffee approach.
      It includes functional similarity during the alignment process, derived from the Gene Ontology Annotation (GOA) collected from UniProt-GOA [4]. | It does not apply any functional similarity. The Gene Ontology is used only after the alignment process, to test the coherence of the alignments.
      It rigorously combines protein sequence similarity, network topology similarity and functional similarity (using GO) into a suitable scoring scheme for aligning k networks. | It rigorously combines protein sequence similarity and network topology similarity for aligning k networks.
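
      As a rough illustration of the last row of Table 1, three similarity sources could be blended into one pairwise score with a weighted sum. This is only a sketch under stated assumptions, not MAPPIN's actual scoring scheme: the function name `combined_score`, the weights `w_seq`, `w_topo` and `w_func`, their default values, and the requirement that every input is already normalized to [0, 1] are all hypothetical.

```python
def combined_score(seq_sim, topo_sim, func_sim,
                   w_seq=0.3, w_topo=0.3, w_func=0.4):
    """Blend three similarities in [0, 1] into one score in [0, 1].

    Hypothetical weighted average; the weights are illustrative and are
    NOT taken from MAPPIN or NetCoffee.
    """
    sims = {"seq_sim": seq_sim, "topo_sim": topo_sim, "func_sim": func_sim}
    for name, value in sims.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    weights = (w_seq, w_topo, w_func)
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Dividing by the weight sum keeps the result in [0, 1].
    blended = w_seq * seq_sim + w_topo * topo_sim + w_func * func_sim
    return blended / total
```

      Setting `w_func` to 0 in this sketch reduces it to a blend of sequence and topology only, which mirrors the NetCoffee column of the table.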

    • System overview:

    Figure: MAPPIN system overview


    • Datasets:
    • We performed a set of experiments to evaluate the effectiveness and efficiency of our approach on several real datasets. We tested MAPPIN on five eukaryotic protein-protein interaction networks (PPINs): Homo sapiens (human), Mus musculus (mouse), Drosophila melanogaster (fly), Caenorhabditis elegans (worm) and Saccharomyces cerevisiae (yeast). The numbers of proteins and interactions in these PPINs are given in Table 2.
      Table 2: Experimental data.
      Species | Proteins | Interactions | Dataset 1 Dataset 2 Dataset 3 Dataset 4
      H. sapiens | 8777 | 28 366 | « × » « × »
      M. musculus | 1531 | 1626 | « × » « × »
      D. melanogaster | 1534 | 2664 | « × » « × » « × » « × »
      C. elegans | 767 | 915 | « × » « × » « × » « × »
      S. cerevisiae | 5739 | 36 226 | « × » « × »
    • Data sources
    • The real datasets used in our tests are available in the MAPPIN\dataset folder at https://github.com/waritheddine/MAPPIN
    • Results:
    • We implemented our approach in C++ using the LEMON Graph Library [5], version 1.3.1.
      All experiments were performed on a personal computer with a 3.40 GHz Intel i7 processor and 16 GB of memory. We used eight threads for each test. For all three multiple network alignment algorithms, we set the alpha parameter to 0.3.


      • We assess the quality of our alignments in terms of coverage (CV), mean entropy (ME) and mean normalized entropy (MNE), and the performance of our method by measuring running time (Time).
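
      The quality measures can be sketched as follows. This is a minimal illustration, not the evaluation code used for the results reported here, and the definitions are assumptions based on common usage in the network alignment literature. Coverage is taken as the percentage of all proteins that appear in at least one aligned cluster. The entropy of a cluster is H = -Σ p_i log p_i, where p_i is the share of GO term i among the annotations in that cluster, and the normalized entropy is H / log d for d distinct GO terms. The function names and the input layout (clusters as sets of protein IDs, annotations as a dict from protein ID to GO terms) are hypothetical.

```python
import math
from collections import Counter


def coverage(clusters, total_proteins):
    """Percentage of proteins that appear in at least one cluster."""
    if total_proteins <= 0:
        raise ValueError("total_proteins must be positive")
    aligned = set().union(*clusters) if clusters else set()
    return 100.0 * len(aligned) / total_proteins


def cluster_entropy(cluster, annotations):
    """Return (entropy, number of distinct GO terms) for one cluster.

    Uses the natural logarithm; proteins without annotations are ignored.
    """
    counts = Counter(
        term for protein in cluster for term in annotations.get(protein, ())
    )
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0
    entropy = sum(-(c / total) * math.log(c / total) for c in counts.values())
    return entropy, len(counts)


def mean_entropies(clusters, annotations):
    """Return (ME, MNE) averaged over clusters with at least one GO term."""
    entropies, normalized = [], []
    for cluster in clusters:
        h, d = cluster_entropy(cluster, annotations)
        if d == 0:
            continue  # assumption: unannotated clusters are skipped
        entropies.append(h)
        # A cluster with a single GO term is perfectly coherent.
        normalized.append(h / math.log(d) if d > 1 else 0.0)
    if not entropies:
        return 0.0, 0.0
    return sum(entropies) / len(entropies), sum(normalized) / len(normalized)
```

      Under these definitions, lower ME and MNE values indicate clusters whose proteins share more of their GO annotations, while higher CV means that more proteins were placed in some cluster.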




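      Dataset | Measure | MAPPIN | NetCoffee | IsoRank-N
      Dataset 1 | CV(%) | 57.1 | - | 18.6
      Dataset 1 | ME | 0.283 | - | 1.235
      Dataset 1 | MNE | 0.206 | - | 0.658
      Dataset 1 | Time | 3m | - | 22.5m
      Dataset 2 | CV(%) | 52.4 | 28.2 | 16.1
      Dataset 2 | ME | 0.286 | 1.504 | 2.927
      Dataset 2 | MNE | 0.223 | 0.6026 | 0.9627
      Dataset 2 | Time | 4m | 3s | 26.5m
      Dataset 3 | CV(%) | 54.4 | 41.2 | 31.1
      Dataset 3 | ME | 0.342 | 2.645 | 3.927
      Dataset 3 | MNE | 0.243 | 0.8721 | 1.173
      Dataset 3 | Time | 9m | 26s | 33.6m
      Dataset 4 | CV(%) | 67.5 | 49.1 | 33.8
      Dataset 4 | ME | 0.415 | 2.288 | 3.597
      Dataset 4 | MNE | 0.281 | 0.7988 | 1.103
      Dataset 4 | Time | 15m | 51.3s | 3.12h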
    • Download:
    • An implementation of the MAPPIN algorithm and instructions describing how to set it up on 64-bit Ubuntu are available here.

    • References:
    • [1] Yanhui Hu, Ian Flockhart, Arunachalam Vinayagam, Clemens Bergwitz, Bonnie Berger, Norbert Perrimon, and Stephanie E. Mohr. An integrative approach to ortholog prediction for disease-focused and other functional studies. BMC Bioinformatics, 12:357, 2011.
      [2] Padmavati Sridhar, Tamer Kahveci, and Sanjay Ranka. An iterative algorithm for metabolic network-based drug target identification. In Biocomputing 2007, Proceedings of the Pacific Symposium, Maui, Hawaii, USA, 3-7 January 2007, pages 88–99, 2007.
      [3] Jialu Hu, Birte Kehr, and Knut Reinert. NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks. Bioinformatics, 30(4):540–548, 2014. doi: 10.1093/bioinformatics/btt715
      [4] http://www.ebi.ac.uk/GOA
      [5] Balazs Dezso, Alpar Juttner, and Peter Kovacs. LEMON - an open source C++ graph template library. Electr. Notes Theor. Comput. Sci., 264(5):23–45, 2011.