| Phenotype | ID | Bacterium | Phylogenetic group | D10 |
|---|---|---|---|---|
| IRRB | B1 | Chroococcidiopsis thermalis PCC 7203 | Cyanobacteria | 4* |
| IRRB | B2 | Deinococcus deserti VCD115 | Deinococcus-Thermus | >7.5 |
| IRRB | B3 | Deinococcus geothermalis DSM 11300 | Deinococcus-Thermus | 10-16 |
| IRRB | B4 | Deinococcus gobiensis I-0 | Deinococcus-Thermus | 12.7 |
| IRRB | B5 | Deinococcus maricopensis DSM 21211 | Deinococcus-Thermus | ~11 |
| IRRB | B6 | Deinococcus proteolyticus MRP | Deinococcus-Thermus | >15 |
| IRRB | B7 | Deinococcus radiodurans R1 | Deinococcus-Thermus | 10 |
| IRRB | B8 | Geodermatophilus obscurus DSM 43160 | Actinobacteria | 9 |
| IRRB | B9 | Kineococcus radiotolerans SRS30216 | Actinobacteria | 2 |
| IRRB | B10 | Kocuria rhizophila DC2201 | Actinobacteria | 2** |
| IRRB | B11 | Methylobacterium radiotolerans JCM 2831 | Proteobacteria | 1 |
| IRRB | B12 | Modestobacter marinus | Actinobacteria | 6 |
| IRRB | B13 | Rubrobacter xylanophilus DSM 9941 | Actinobacteria | 5.5 |
| IRRB | B14 | Truepera radiovictrix DSM 17093 | Deinococcus-Thermus | >5 |
| IRSB | B15 | Brucella abortus S19 | Proteobacteria | 0.34 |
| IRSB | B16 | Escherichia coli B REL606 | Proteobacteria | 0.7 |
| IRSB | B17 | Escherichia coli str. K-12 substr. DH10B | Proteobacteria | 0.7 |
| IRSB | B18 | Neisseria gonorrhoeae FA 1090 | Proteobacteria | 0.07-0.125 |
| IRSB | B19 | Neisseria gonorrhoeae TCDC NG08107 | Proteobacteria | 0.07-0.125 |
| IRSB | B20 | Pseudomonas putida S16 | Proteobacteria | 0.25 |
| IRSB | B21 | Shewanella oneidensis MR-1 | Proteobacteria | 0.07 |
| IRSB | B22 | Shigella dysenteriae 1617 | Proteobacteria | 0.22 |
| IRSB | B23 | Thermus thermophilus HB27 | Deinococcus-Thermus | 0.8 |
| IRSB | B24 | Thermus thermophilus HB8 | Deinococcus-Thermus | 0.8*** |
| IRSB | B25 | Thermus thermophilus JL-18 | Deinococcus-Thermus | 0.8*** |
| IRSB | B26 | Thermus thermophilus SG0.5JP17-16 | Deinococcus-Thermus | 0.8*** |
| IRSB | B27 | Vibrio parahaemolyticus RIMD 2210633 | Proteobacteria | 0.03-0.06 |
| IRSB | B28 | Yersinia enterocolitica 8081 | Proteobacteria | 0.1-0.21 |

*Value reported for Chroococcidiopsis spp.; **value reported for Kocuria rosea; ***value reported for T. thermophilus HB27.

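| ID | Protein | Function |
|---|---|---|
| P1 | Hypothetical DNA polymerase | DNA polymerase |
| P2 | DNA polymerase III, α subunit | DNA polymerase |
| P3 | DNA-directed DNA polymerase | DNA polymerase |
| P4 | DNA polymerase III, τ/γ subunit | DNA polymerase |
| P5 | Single-stranded DNA-binding protein | Replication complex |
| P6 | Replicative DNA helicase | Replication complex |
| P7 | DNA primase | Replication complex |
| P8 | DNA gyrase, subunit B | Replication complex |
| P9 | DNA topoisomerase I | Replication complex |
| P10 | DNA gyrase, subunit A | Replication complex |
| P11 | smf protein | Other DNA-associated proteins |
| P12 | Endonuclease III | Other DNA-associated proteins |
| P13 | Holliday junction resolvase | Other DNA-associated proteins |
| P14 | Formamidopyrimidine-DNA glycosylase | Other DNA-associated proteins |
| P15 | Holliday junction DNA helicase | Other DNA-associated proteins |
| P16 | RecF protein | Other DNA-associated proteins |
| P17 | DNA repair protein radA | Other DNA-associated proteins |
| P18 | Holliday junction binding protein | Other DNA-associated proteins |
| P19 | Excinuclease ABC, subunit C | Other DNA-associated proteins |
| P20 | DNA repair protein RecN | Other DNA-associated proteins |
| P21 | Transcription-repair coupling factor | Other DNA-associated proteins |
| P22 | Excinuclease ABC, subunit A | Other DNA-associated proteins |
| P23 | DNA helicase II | Other DNA-associated proteins |
| P24 | DNA helicase RecG | Other DNA-associated proteins |
| P25 | Exonuclease SbcD, putative | Other DNA-associated proteins |
| P26 | Exonuclease SbcC | Other DNA-associated proteins |
| P27 | Ribonuclease HII | Other DNA-associated proteins |
| P28 | Excinuclease ABC, subunit B | Other DNA-associated proteins |
| P29 | A/G-specific adenine glycosylase | Other DNA-associated proteins |
| P30 | RecA protein | Other DNA-associated proteins |
| P31 | DNA-3-methyladenine glycosidase II, putative | Other DNA-associated proteins |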

The computations were carried out on a PC with a 2.49 GHz i7 CPU and 6 GB of memory, running Ubuntu Linux. For classification, we used the leave-one-out (LOO) evaluation technique.

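
As an illustration of leave-one-out evaluation, the following minimal Python sketch holds out each sample in turn, trains on the remaining ones, and scores the held-out prediction. The classifier `nearest_mean_classifier`, the single numeric feature, and the toy samples are all hypothetical stand-ins chosen for brevity; they are not the classifier, the features, or the data used in this work.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, str]  # (toy numeric feature, phenotype label)


def nearest_mean_classifier(train: List[Sample], x: float) -> str:
    """Hypothetical stand-in classifier (assumption, not the paper's method):
    predict the label whose mean training feature is closest to x."""
    totals: Dict[str, Tuple[float, int]] = {}
    for value, label in train:
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + value, count + 1)
    return min(totals, key=lambda lab: abs(x - totals[lab][0] / totals[lab][1]))


def leave_one_out(samples: List[Sample],
                  classify: Callable[[List[Sample], float], str]) -> List[str]:
    """Predict each sample's label using all other samples as training data."""
    predictions = []
    for i, (x, _) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the i-th
        predictions.append(classify(train, x))
    return predictions


def accuracy(samples: List[Sample], predictions: List[str]) -> float:
    """Percentage of held-out samples whose predicted label matches the true one."""
    correct = sum(1 for (_, y), p in zip(samples, predictions) if y == p)
    return 100.0 * correct / len(samples)


# Toy data (made up for illustration only).
toy_samples: List[Sample] = [
    (9.0, "IRRB"), (10.0, "IRRB"), (12.0, "IRRB"), (1.0, "IRRB"),
    (0.1, "IRSB"), (0.7, "IRSB"), (0.8, "IRSB"), (0.3, "IRSB"),
]

toy_predictions = leave_one_out(toy_samples, nearest_mean_classifier)
toy_accuracy = accuracy(toy_samples, toy_predictions)
print(toy_predictions)
print(toy_accuracy)
```

With 28 bacteria, the same loop would run 28 times, each time training on 27 genomes and testing on the one left out.
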
| Used proteins | Aggregation method | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| All proteins | SMS | 92.80 | 92.80 | 92.80 |
| All proteins | WAMS | 89.2 | 92.30 | 86.6 |
| DNA polymerase proteins | SMS | 89.2 | 92.30 | 86.6 |
| DNA polymerase proteins | WAMS | 89.2 | 92.30 | 86.6 |
| Replication complex proteins | SMS | 92.80 | 92.80 | 92.80 |
| Replication complex proteins | WAMS | 92.80 | 92.80 | 92.80 |
| Other DNA-associated proteins | SMS | 92.80 | 92.80 | 92.80 |
| Other DNA-associated proteins | WAMS | 92.80 | 92.80 | 92.80 |

| Phenotype | Bacterium ID | Successful predictions (%) |
|---|---|---|
| IRRB | B1 | 100 |
| IRRB | B2 | 100 |
| IRRB | B3 | 100 |
| IRRB | B4 | 100 |
| IRRB | B5 | 100 |
| IRRB | B6 | 100 |
| IRRB | B7 | 100 |
| IRRB | B8 | 100 |
| IRRB | B9 | 100 |
| IRRB | B10 | 100 |
| IRRB | B11 | 0 |
| IRRB | B12 | 100 |
| IRRB | B13 | 100 |
| IRRB | B14 | 62.5* |
| IRSB | B15 | 0 |
| IRSB | B16 | 100 |
| IRSB | B17 | 100 |
| IRSB | B18 | 100 |
| IRSB | B19 | 100 |
| IRSB | B20 | 100 |
| IRSB | B21 | 100 |
| IRSB | B22 | 100 |
| IRSB | B23 | 100 |
| IRSB | B24 | 100 |
| IRSB | B25 | 100 |
| IRSB | B26 | 100 |
| IRSB | B27 | 100 |
| IRSB | B28 | 100 |

*B14 was successfully classified using (1) all proteins with the SMS aggregation method, (2) replication complex proteins with the SMS and WAMS aggregation methods, and (3) other DNA-associated proteins with the SMS and WAMS aggregation methods.

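
The 62.5% reported for B14 is consistent with five correct predictions out of eight configurations (four protein sets, each with two aggregation methods). The short Python sketch below reproduces that arithmetic; it assumes that every configuration counts equally toward the percentage, which is an inference from the footnote rather than something stated explicitly, and the variable names are illustrative only.

```python
# Four protein sets and two aggregation methods give eight configurations.
protein_sets = [
    "All proteins",
    "DNA polymerase proteins",
    "Replication complex proteins",
    "Other DNA-associated proteins",
]
aggregation_methods = ["SMS", "WAMS"]
configurations = [(p, m) for p in protein_sets for m in aggregation_methods]

# Configurations in which B14 was classified correctly, as listed in the footnote.
b14_correct = {
    ("All proteins", "SMS"),
    ("Replication complex proteins", "SMS"),
    ("Replication complex proteins", "WAMS"),
    ("Other DNA-associated proteins", "SMS"),
    ("Other DNA-associated proteins", "WAMS"),
}

# Assumption: each configuration contributes equally to the success rate.
n_correct = sum(1 for c in configurations if c in b14_correct)
b14_success_rate = 100.0 * n_correct / len(configurations)
print(n_correct, len(configurations), b14_success_rate)  # 5 8 62.5
```
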
| Protein ID | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|
| P1 | 85.7 | 100 | 77.7 |
| P2 | 89.2 | 92.3 | 86.6 |
| P3 | 82.1 | 90.9 | 76.4 |
| P4 | 89.2 | 92.3 | 86.6 |
| P5 | 89.2 | 92.3 | 86.6 |
| P6 | 89.2 | 92.3 | 86.6 |
| P7 | 89.2 | 92.3 | 86.6 |
| P8 | 78.5 | 83.3 | 75 |
| P9 | 89.2 | 92.3 | 86.6 |
| P10 | 89.2 | 92.3 | 86.6 |
| P11 | 89.2 | 92.3 | 86.6 |
| P12 | 89.2 | 92.3 | 86.6 |
| P13 | 78.5 | 90 | 72.2 |
| P14 | 89.2 | 92.3 | 86.6 |
| P15 | 85.7 | 91.6 | 81.2 |
| P16 | 89.2 | 92.3 | 86.6 |
| P17 | 85.7 | 91.6 | 81.2 |
| P18 | 85.7 | 91.6 | 81.2 |
| P19 | 89.2 | 92.3 | 86.6 |
| P20 | 85.7 | 91.6 | 81.2 |
| P21 | 85.7 | 91.6 | 81.2 |
| P22 | 89.2 | 92.3 | 86.6 |
| P23 | 89.2 | 92.3 | 86.6 |
| P24 | 89.2 | 92.3 | 86.6 |
| P25 | 85.7 | 91.6 | 81.2 |
| P26 | 82.1 | 90.9 | 76.4 |
| P27 | 82.1 | 100 | 73.6 |
| P28 | 89.2 | 92.3 | 86.6 |
| P29 | 78.5 | 90 | 72.2 |
| P30 | 89.2 | 92.3 | 86.6 |
| P31 | 78.5 | 78.5 | 78.5 |