Phenotype | ID | Bacterium | Phylogenetic group | D10 (kGy) |
---|---|---|---|---|
IRRB | B1 | Chroococcidiopsis thermalis PCC 7203 | Cyanobacteria | 4* |
IRRB | B2 | Deinococcus deserti VCD115 | Deinococcus-Thermus | >7.5 |
IRRB | B3 | Deinococcus geothermalis DSM 11300 | Deinococcus-Thermus | 10-16 |
IRRB | B4 | Deinococcus gobiensis I-0 | Deinococcus-Thermus | 12.7 |
IRRB | B5 | Deinococcus maricopensis DSM 21211 | Deinococcus-Thermus | ~11 |
IRRB | B6 | Deinococcus proteolyticus MRP | Deinococcus-Thermus | >15 |
IRRB | B7 | Deinococcus radiodurans R1 | Deinococcus-Thermus | 10 |
IRRB | B8 | Geodermatophilus obscurus DSM 43160 | Actinobacteria | 9 |
IRRB | B9 | Kineococcus radiotolerans SRS30216 | Actinobacteria | 2 |
IRRB | B10 | Kocuria rhizophila DC2201 | Actinobacteria | 2** |
IRRB | B11 | Methylobacterium radiotolerans JCM 2831 | Proteobacteria | 1 |
IRRB | B12 | Modestobacter marinus | Actinobacteria | 6 |
IRRB | B13 | Rubrobacter xylanophilus DSM 9941 | Actinobacteria | 5.5 |
IRRB | B14 | Truepera radiovictrix DSM 17093 | Deinococcus-Thermus | >5 |
IRSB | B15 | Brucella abortus S19 | Proteobacteria | 0.34 |
IRSB | B16 | Escherichia coli B REL606 | Proteobacteria | 0.7 |
IRSB | B17 | Escherichia coli str. K-12 substr. DH10B | Proteobacteria | 0.7 |
IRSB | B18 | Neisseria gonorrhoeae FA 1090 | Proteobacteria | 0.07-0.125 |
IRSB | B19 | Neisseria gonorrhoeae TCDC NG08107 | Proteobacteria | 0.07-0.125 |
IRSB | B20 | Pseudomonas putida S16 | Proteobacteria | 0.25 |
IRSB | B21 | Shewanella oneidensis MR-1 | Proteobacteria | 0.07 |
IRSB | B22 | Shigella dysenteriae 1617 | Proteobacteria | 0.22 |
IRSB | B23 | Thermus thermophilus HB27 | Deinococcus-Thermus | 0.8 |
IRSB | B24 | Thermus thermophilus HB8 | Deinococcus-Thermus | 0.8*** |
IRSB | B25 | Thermus thermophilus JL-18 | Deinococcus-Thermus | 0.8*** |
IRSB | B26 | Thermus thermophilus SG0.5JP17-16 | Deinococcus-Thermus | 0.8*** |
IRSB | B27 | Vibrio parahaemolyticus RIMD 2210633 | Proteobacteria | 0.03-0.06 |
IRSB | B28 | Yersinia enterocolitica 8081 | Proteobacteria | 0.1-0.21 |

*Value for Chroococcidiopsis spp.
**Value for Kocuria rosea.
***Value for T. thermophilus HB27.

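ID | Protein | Function |
---|---|---|
P1 | Hypothetical DNA polymerase | DNA polymerase |
P2 | DNA polymerase III, α subunit | DNA polymerase |
P3 | DNA-directed DNA polymerase | DNA polymerase |
P4 | DNA polymerase III, τ/γ subunit | DNA polymerase |
P5 | Single-stranded DNA-binding protein | Replication complex |
P6 | Replicative DNA helicase | Replication complex |
P7 | DNA primase | Replication complex |
P8 | DNA gyrase, subunit B | Replication complex |
P9 | DNA topoisomerase I | Replication complex |
P10 | DNA gyrase subunit A | Replication complex |
P11 | smf protein | Other DNA-associated proteins |
P12 | Endonuclease III | Other DNA-associated proteins |
P13 | Holliday junction resolvase | Other DNA-associated proteins |
P14 | Formamidopyrimidine-DNA glycosylase | Other DNA-associated proteins |
P15 | Holliday junction DNA helicase | Other DNA-associated proteins |
P16 | RecF protein | Other DNA-associated proteins |
P17 | DNA repair protein radA | Other DNA-associated proteins |
P18 | Holliday junction binding protein | Other DNA-associated proteins |
P19 | Excinuclease ABC, subunit C | Other DNA-associated proteins |
P20 | DNA repair protein RecN | Other DNA-associated proteins |
P21 | Transcription-repair coupling factor | Other DNA-associated proteins |
P22 | Excinuclease ABC, subunit A | Other DNA-associated proteins |
P23 | DNA helicase II | Other DNA-associated proteins |
P24 | DNA helicase RecG | Other DNA-associated proteins |
P25 | Exonuclease SbcD, putative | Other DNA-associated proteins |
P26 | Exonuclease SbcC | Other DNA-associated proteins |
P27 | Ribonuclease HII | Other DNA-associated proteins |
P28 | Excinuclease ABC, subunit B | Other DNA-associated proteins |
P29 | A/G-specific adenine glycosylase | Other DNA-associated proteins |
P30 | RecA protein | Other DNA-associated proteins |
P31 | DNA-3-methyladenine glycosidase II, putative | Other DNA-associated proteins |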

The computations were carried out on a PC with a 2.49 GHz i7 CPU and 6 GB of memory, running Ubuntu Linux. In the classification process, we used the Leave-One-Out (LOO) evaluation technique.
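
As an illustration of the Leave-One-Out protocol mentioned above, the following minimal sketch holds out each sample once, trains on the remaining ones, and predicts the held-out sample. It is not the authors' implementation: `leave_one_out`, `fit_predict`, the nearest-neighbour stand-in classifier, and the toy data are all hypothetical names and values introduced only for this example.

```python
from typing import Callable, List, Sequence, Tuple

Label = str  # e.g. "IRRB" or "IRSB"


def leave_one_out(
    samples: Sequence[float],
    labels: Sequence[Label],
    fit_predict: Callable[[Sequence[float], Sequence[Label], float], Label],
) -> Tuple[List[Label], float]:
    """Hold out each sample once, train on the rest, predict the held-out one.

    Returns the list of held-out predictions and the accuracy in percent.
    """
    predictions: List[Label] = []
    for i in range(len(samples)):
        # Training set: everything except sample i.
        train_x = [s for j, s in enumerate(samples) if j != i]
        train_y = [y for j, y in enumerate(labels) if j != i]
        predictions.append(fit_predict(train_x, train_y, samples[i]))
    correct = sum(p == y for p, y in zip(predictions, labels))
    return predictions, 100.0 * correct / len(labels)


def nearest_neighbour(train_x, train_y, query):
    """Hypothetical stand-in classifier: label of the closest training value."""
    best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - query))
    return train_y[best]


# Toy one-dimensional data, invented purely for illustration.
toy_x = [0.1, 0.2, 0.3, 5.0, 6.0, 7.0]
toy_y = ["IRSB", "IRSB", "IRSB", "IRRB", "IRRB", "IRRB"]

toy_predictions, toy_accuracy = leave_one_out(toy_x, toy_y, nearest_neighbour)
print(toy_predictions, toy_accuracy)
```

With 28 bacteria, this protocol would train 28 models per configuration, each evaluated on the single bacterium left out.
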

Used proteins | Aggregation method | Accuracy (%) | Sensitivity (%) | Specificity (%) |
---|---|---|---|---|
All proteins | SMS | 92.80 | 92.80 | 92.80 |
All proteins | WAMS | 89.2 | 92.30 | 86.6 |
DNA polymerase proteins | SMS | 89.2 | 92.30 | 86.6 |
DNA polymerase proteins | WAMS | 89.2 | 92.30 | 86.6 |
Replication complex proteins | SMS | 92.80 | 92.80 | 92.80 |
Replication complex proteins | WAMS | 92.80 | 92.80 | 92.80 |
Other DNA-associated proteins | SMS | 92.80 | 92.80 | 92.80 |
Other DNA-associated proteins | WAMS | 92.80 | 92.80 | 92.80 |

Phenotype | Bacterium ID | Successful predictions (%) |
---|---|---|
IRRB | B1 | 100 |
IRRB | B2 | 100 |
IRRB | B3 | 100 |
IRRB | B4 | 100 |
IRRB | B5 | 100 |
IRRB | B6 | 100 |
IRRB | B7 | 100 |
IRRB | B8 | 100 |
IRRB | B9 | 100 |
IRRB | B10 | 100 |
IRRB | B11 | 0 |
IRRB | B12 | 100 |
IRRB | B13 | 100 |
IRRB | B14 | 62.5* |
IRSB | B15 | 0 |
IRSB | B16 | 100 |
IRSB | B17 | 100 |
IRSB | B18 | 100 |
IRSB | B19 | 100 |
IRSB | B20 | 100 |
IRSB | B21 | 100 |
IRSB | B22 | 100 |
IRSB | B23 | 100 |
IRSB | B24 | 100 |
IRSB | B25 | 100 |
IRSB | B26 | 100 |
IRSB | B27 | 100 |
IRSB | B28 | 100 |

*Successfully classified using: (1) all proteins with the SMS aggregation method, (2) replication complex proteins with the SMS and WAMS aggregation methods, and (3) other DNA-associated proteins with the SMS and WAMS aggregation methods.

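
The 62.5% figure for B14 can be checked with a short calculation. The sketch below assumes that the success percentage is taken over the eight combinations of protein set and aggregation method in the aggregation results table. The source does not state this denominator explicitly, but the five combinations listed in the footnote give 5/8 = 62.5%. The names `PROTEIN_SETS`, `METHODS`, `B14_CORRECT`, and `success_rate` are hypothetical and introduced only for this example.

```python
from itertools import product

# Protein sets and aggregation methods as they appear in the results table.
PROTEIN_SETS = [
    "All proteins",
    "DNA polymerase proteins",
    "Replication complex proteins",
    "Other DNA-associated proteins",
]
METHODS = ["SMS", "WAMS"]

# Configurations in which B14 was classified correctly, per the footnote.
B14_CORRECT = {
    ("All proteins", "SMS"),
    ("Replication complex proteins", "SMS"),
    ("Replication complex proteins", "WAMS"),
    ("Other DNA-associated proteins", "SMS"),
    ("Other DNA-associated proteins", "WAMS"),
}


def success_rate(correct_configs, all_configs):
    """Percentage of configurations in which the bacterium was classified correctly."""
    hits = sum(1 for config in all_configs if config in correct_configs)
    return 100.0 * hits / len(all_configs)


# Assumed denominator: every (protein set, method) pair, i.e. 4 x 2 = 8.
ALL_CONFIGS = list(product(PROTEIN_SETS, METHODS))
b14_rate = success_rate(B14_CORRECT, ALL_CONFIGS)
print(b14_rate)  # 62.5
```

Under the same assumption, a value of 100 means the bacterium was classified correctly in all eight configurations, and 0 means it was misclassified in all of them.
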
Protein ID | Accuracy (%) | Sensitivity (%) | Specificity (%) |
---|---|---|---|
P1 | 85.7 | 100 | 77.7 |
P2 | 89.2 | 92.3 | 86.6 |
P3 | 82.1 | 90.9 | 76.4 |
P4 | 89.2 | 92.3 | 86.6 |
P5 | 89.2 | 92.3 | 86.6 |
P6 | 89.2 | 92.3 | 86.6 |
P7 | 89.2 | 92.3 | 86.6 |
P8 | 78.5 | 83.3 | 75 |
P9 | 89.2 | 92.3 | 86.6 |
P10 | 89.2 | 92.3 | 86.6 |
P11 | 89.2 | 92.3 | 86.6 |
P12 | 89.2 | 92.3 | 86.6 |
P13 | 78.5 | 90 | 72.2 |
P14 | 89.2 | 92.3 | 86.6 |
P15 | 85.7 | 91.6 | 81.2 |
P16 | 89.2 | 92.3 | 86.6 |
P17 | 85.7 | 91.6 | 81.2 |
P18 | 85.7 | 91.6 | 81.2 |
P19 | 89.2 | 92.3 | 86.6 |
P20 | 85.7 | 91.6 | 81.2 |
P21 | 85.7 | 91.6 | 81.2 |
P22 | 89.2 | 92.3 | 86.6 |
P23 | 89.2 | 92.3 | 86.6 |
P24 | 89.2 | 92.3 | 86.6 |
P25 | 85.7 | 91.6 | 81.2 |
P26 | 82.1 | 90.9 | 76.4 |
P27 | 82.1 | 100 | 73.6 |
P28 | 89.2 | 92.3 | 86.6 |
P29 | 78.5 | 90 | 72.2 |
P30 | 89.2 | 92.3 | 86.6 |
P31 | 78.5 | 78.5 | 78.5 |