Analyzing protein–protein interaction (PPI) networks has proven effective for many problems, such as understanding the genetic factors that impact various diseases [1], drug discovery [2], predicting protein functions, identifying functional modules, and inferring phylogeny from these data. Network alignment approaches are generally classified as pairwise or multiple, and as local or global. Pairwise approaches align two networks, whereas multiple approaches align three or more. Local alignment approaches detect conserved subnetworks, rather than entire networks, across two (pairwise local alignment) or more (multiple local alignment) networks. However, the aligned regions can overlap, leading to "ambiguous" many-to-many mappings. Global network alignment (GNA) was proposed to address this.
In this work, we present MAPPIN (Multiple Alignment for Protein Protein Interactions Networks), a method for the global many-to-many alignment of multiple PPI networks (PPINs) from different species. Our approach is based on NetCoffee [3]; however, the two differ on several points, which are summarized in Table 1.
| MAPPIN | NetCoffee |
|---|---|
| It aligns two or more PPI networks. | It aligns three or more networks, so it cannot align only two. |
| Topological similarity is used for hub detection and in the seed-expansion phase. | Topological similarity is based on the T-Coffee approach. |
| It includes functional similarity in the alignment process, using Gene Ontology annotations (GOA) collected from UniProt-GOA [4]. | It does not apply any functional similarity. The Gene Ontology is used only after the alignment, to test the coherence of the alignments. |
| It rigorously combines protein sequence similarity, network topology similarity and functional similarity (using GO) into a suitable scoring scheme for aligning k multiple networks. | It rigorously combines protein sequence similarity and network topology similarity for aligning k multiple networks. |
MAPPIN System overview
| Species | Proteins | Interactions | Dataset 1 | Dataset 2 | Dataset 3 | Dataset 4 |
|---|---|---|---|---|---|---|
| H. sapiens | 8777 | 28 366 | × | × | | |
| M. musculus | 1531 | 1626 | × | × | | |
| D. melanogaster | 1534 | 2664 | × | × | × | × |
| C. elegans | 767 | 915 | × | × | × | × |
| S. cerevisiae | 5739 | 36 226 | × | × | | |
We implemented our approach in C++ using the LEMON Graph Library [5], version 1.3.1. All experiments were performed on a personal computer with a 3.40 GHz Intel i7 processor and 16 GB of memory. We used eight threads for each test. For all three multiple network alignment algorithms, we set the Alpha parameter to 0.3.
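
To illustrate what a balancing parameter such as Alpha typically controls, the following minimal Python sketch blends per-pair similarity scores with a convex weight. It is an illustration under stated assumptions, not the formula used by MAPPIN, NetCoffee, or IsoRank-N: the function name `combined_score`, the reading of `alpha` as the weight on sequence similarity, the extra weight `func_weight`, and the requirement that every score lies in [0, 1] are all hypothetical.

```python
def combined_score(seq_sim, topo_sim, func_sim, alpha=0.3, func_weight=0.5):
    """Blend three similarity scores for one candidate protein pair.

    Assumption (not taken from the paper): alpha weights sequence
    similarity, and the remaining (1 - alpha) is shared between topology
    and functional similarity according to the hypothetical func_weight.
    """
    scores = {"seq_sim": seq_sim, "topo_sim": topo_sim, "func_sim": func_sim}
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    if not (0.0 <= alpha <= 1.0 and 0.0 <= func_weight <= 1.0):
        raise ValueError("alpha and func_weight must lie in [0, 1]")

    # Convex mix of the two non-sequence evidence sources.
    non_sequence = (1.0 - func_weight) * topo_sim + func_weight * func_sim
    # Convex mix of sequence evidence and everything else.
    return alpha * seq_sim + (1.0 - alpha) * non_sequence


# Toy similarity values, chosen only for illustration.
print(combined_score(seq_sim=0.9, topo_sim=0.4, func_sim=0.6))
```

With `alpha = 0.3`, a pair supported only by sequence similarity scores 0.3 under this assumed scheme, while a pair supported only by topology and function scores 0.7.
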
| Dataset | Measure | MAPPIN | NetCoffee | IsoRank-N |
|---|---|---|---|---|
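| Dataset 1 | CV (%) | 57.1 | - | 18.6 |
| Dataset 1 | ME | 0.283 | - | 1.235 |
| Dataset 1 | MNE | 0.206 | - | 0.658 |
| Dataset 1 | Time | 3 min | - | 22.5 min |
| Dataset 2 | CV (%) | 52.4 | 28.2 | 16.1 |
| Dataset 2 | ME | 0.286 | 1.504 | 2.927 |
| Dataset 2 | MNE | 0.223 | 0.6026 | 0.9627 |
| Dataset 2 | Time | 4 min | 3 s | 26.5 min |
| Dataset 3 | CV (%) | 54.4 | 41.2 | 31.1 |
| Dataset 3 | ME | 0.342 | 2.645 | 3.927 |
| Dataset 3 | MNE | 0.243 | 0.8721 | 1.173 |
| Dataset 3 | Time | 9 min | 26 s | 33.6 min |
| Dataset 4 | CV (%) | 67.5 | 49.1 | 33.8 |
| Dataset 4 | ME | 0.415 | 2.288 | 3.597 |
| Dataset 4 | MNE | 0.281 | 0.7988 | 1.103 |
| Dataset 4 | Time | 15 min | 51.3 s | 3.12 h |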
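
Assuming ME and MNE denote the mean entropy and mean normalized entropy commonly used to judge the functional consistency of aligned clusters (lower values indicate more coherent clusters), they could be computed along the following lines. The Python sketch simplifies each protein to a single GO label, which real annotations do not guarantee; the function names and the toy clusters are hypothetical.

```python
import math
from collections import Counter


def cluster_entropy(labels):
    """Shannon entropy (natural log) of the label distribution in one cluster."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Sum of p * log(1/p) over distinct labels; 0.0 for a single-label cluster.
    return sum((c / total) * math.log(total / c) for c in counts.values())


def normalized_entropy(labels):
    """Entropy divided by log(d), where d is the number of distinct labels."""
    d = len(set(labels))
    if d <= 1:
        return 0.0  # log(1) = 0, so the normalized value is defined as 0 here
    return cluster_entropy(labels) / math.log(d)


def mean_entropies(clusters):
    """Return (ME, MNE) averaged over a non-empty list of clusters."""
    if not clusters:
        raise ValueError("need at least one cluster")
    me = sum(cluster_entropy(c) for c in clusters) / len(clusters)
    mne = sum(normalized_entropy(c) for c in clusters) / len(clusters)
    return me, mne


# Toy input: each inner list holds one illustrative GO label per aligned protein.
clusters = [
    ["GO:0006412", "GO:0006412", "GO:0006412"],  # functionally coherent
    ["GO:0006412", "GO:0008152"],                # mixed
]
me, mne = mean_entropies(clusters)
print(f"ME = {me:.3f}, MNE = {mne:.3f}")
```

For the toy input, the coherent cluster contributes zero entropy and the evenly mixed cluster contributes log 2, so the script prints `ME = 0.347, MNE = 0.500`.
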