Protein-DNA complexes benchmarks
HR-PDNA187 benchmark
The high-resolution protein-DNA complexes benchmark (HR-PDNA187) comprises 187 structures of protein-(ds)DNA complexes that are non-redundant at 25% sequence identity. The 3D structures of the complexes were downloaded from the Protein Data Bank (http://www.rcsb.org/); all were determined by X-ray crystallography with a resolution better than 2.5 Å and an R-factor lower than 0.3. Each complex comprises at least one protein chain longer than 40 amino acids and a DNA molecule with at least 5 base pairs. The HR-PDNA187 benchmark covers all major groups of protein-DNA interactions according to the classification of Luscombe et al. (Luscombe NM, Austin SE, Berman HM, Thornton JM. An overview of the structures of protein-DNA complexes. Genome Biology. 2000;1(1):reviews001.1-reviews001.37.): helix-turn-helix (HTH), zinc-coordinating, zipper type, other α-helical, β-sheet, β-hairpin/ribbon, other. Moreover, it spans a wide range of functional classes according to the Nucleic Acid Database (http://ndbserver.rutgers.edu/) classification: it comprises 100 enzymes, 78 regulatory proteins, 7 structural proteins, 1 protein with other function and 1 unclassified protein. Concerning protein stoichiometry, HR-PDNA187 comprises 109 monomers, 64 homo-2-mers, 3 hetero-2-mers, 3 homo-3-mers, 1 hetero-3-mer, 5 homo-4-mers, 1 hetero-5-mer (composed of 4 couples of homo-2-mers) and 1 homo-6-mer.
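The structural selection criteria stated above can be written as a single predicate. The following is only a minimal sketch, not the pipeline actually used to build the benchmark: the `ComplexCandidate` record, its field names and the method string are hypothetical, the example values are invented for illustration, and the 25% sequence-identity redundancy filter is not covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComplexCandidate:
    """Hypothetical record for one candidate complex (field names are illustrative)."""
    pdb_id: str
    method: str                       # experimental method, e.g. "X-RAY DIFFRACTION"
    resolution: float                 # in Å
    r_factor: float
    protein_chain_lengths: List[int]  # number of residues in each protein chain
    dna_base_pairs: int


def passes_structure_filters(c: ComplexCandidate) -> bool:
    """Apply the thresholds quoted in the text to one candidate."""
    return (
        c.method == "X-RAY DIFFRACTION"                    # X-ray structures only
        and c.resolution < 2.5                             # resolution better than 2.5 Å
        and c.r_factor < 0.3                               # R-factor lower than 0.3
        and any(n > 40 for n in c.protein_chain_lengths)   # one chain longer than 40 aa
        and c.dna_base_pairs >= 5                          # at least 5 base pairs
    )


# Invented values, for illustration only.
kept = ComplexCandidate("xxxx", "X-RAY DIFFRACTION", 1.9, 0.21, [120, 35], 12)
dropped = ComplexCandidate("yyyy", "X-RAY DIFFRACTION", 2.8, 0.21, [120], 12)
print(passes_structure_filters(kept), passes_structure_filters(dropped))
```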
HOLO-APO82 benchmark
This benchmark comprises the 82 HOLO (bound)-APO (unbound) pairs available in the Protein Data Bank (http://www.rcsb.org/) among the 187 protein-DNA complexes of HR-PDNA187. The collected APO forms are all X-ray crystallographic structures that share at least 95% sequence identity, a coverage ≥ 70% and a percentage of gaps ≤ 10% with the corresponding HOLO form. Two unbound forms in different oligomeric states, a dimer and a monomer, are present for the complex 2ISZ. We retained both APO forms because the literature reports that both are present in equilibrium (Chou, C. James, et al. "Functional studies of the Mycobacterium tuberculosis iron-dependent regulator." Journal of Biological Chemistry 279.51 (2004): 53554-53561.). Within each HOLO-APO pair, the APO form may be in the same oligomeric state as the HOLO form or may have fewer chains, owing to the oligomerization process associated with DNA binding. The APO forms in the HOLO-APO82 dataset are divided into 52 monomers, 28 homo-2-mers, 1 homo-3-mer and 1 hetero-3-mer. Specifically, we observed a change in the stoichiometry in 4 HOLO-APO pairs: 3 homo-4-mers in the bound form are homo-2-mers in the unbound form, while 1 homo-4-mer, 1 homo-3-mer and 1 homo-2-mer in the bound form are monomers in the unbound form.
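The acceptance thresholds for an APO structure can be summarised in a short check. This is a sketch under stated assumptions: the function name and parameters are hypothetical, the alignment statistics are taken as precomputed percentages, and the text does not specify how identity, coverage or gaps were measured.

```python
def is_acceptable_apo(identity_pct: float, coverage_pct: float,
                      gaps_pct: float, is_xray: bool) -> bool:
    """Check one APO candidate against the thresholds quoted in the text.

    All percentages are relative to the corresponding HOLO form and are
    assumed to come from a sequence alignment computed elsewhere.
    """
    return (
        is_xray                     # APO forms are X-ray structures
        and identity_pct >= 95.0    # at least 95% sequence identity
        and coverage_pct >= 70.0    # coverage of at least 70%
        and gaps_pct <= 10.0        # at most 10% gaps
    )


# Invented alignment statistics, for illustration only.
print(is_acceptable_apo(98.5, 92.0, 1.2, True))
print(is_acceptable_apo(98.5, 65.0, 1.2, True))
```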
Download
The folder with the HR-PDNA187 and the HOLO-APO82 benchmarks is available here.
The Excel version of the table below is available here.
Table 1: List of the 187 complexes in the HR-PDNA187 dataset and the 82 HOLO-APO pairs.
The columns contain, respectively: 1) the Protein Data Bank (PDB) identifier of the HR-PDNA187 complex; 2) the protein chains considered in the complex; 3) the subset of the protein chains in column 2 that is non-redundant at 95% sequence identity; 4) the DNA chains considered in the complex; 5) the PDB identifier and the considered chains of the complex in the HOLO-APO82 dataset; 6) the PDB identifier and the considered chains of the unbound form in the HOLO-APO82 dataset; 7) the class and 8) the subclass to which the protein belongs, as derived from the Nucleic Acid Database.
PDB ID | Prot chains | Prot nr chains | DNA chains | HOLO ID:chains | APO ID:chains | Class | Subclass |
---|---|---|---|---|---|---|---|
1a3q | AB | A | CD | | | regulatory | transcription_factor |
1a73 | AB | A | CDEF | 1a73:BA | 1evx:AB | enzyme | nuclease |
1b3t | AB | A | CD | 1b3t:BA | 1vhi:AB | regulatory | other |
1bdt | ABCD | A | EF | 1bdt:CD | 1myk:AB | regulatory | gene |
1bl0 | A | A | BC | | | regulatory | transcription_factor |
1cez | A | A | NT | | | enzyme | polymerase |
1d02 | AB | A | CD | | | enzyme | nuclease |
1dc1 | AB | A | CW | | | enzyme | nuclease |
1dfm | AB | A | CD | | | enzyme | nuclease |
1egw | AB | A | EF | | | regulatory | transcription_factor |
1emh | A | A | BC | 1emh:A | 3fci:A | enzyme | glycosylase |
1esg | AB | A | CD | | | enzyme | nuclease |
1f4k | AB | A | DE | 1f4k:BA | 2dqr:AB | regulatory | replication |
1fiu | ABCD | A | EFGHIJKL | | | enzyme | nuclease |
1gu4 | AB | A | CD | | | regulatory | transcription_factor |
1gxp | AB | A | CD | 1gxp:B | 1gxq:A | regulatory | other |
1h6f | AB | A | CD | | | regulatory | transcription_factor |
1hlv | A | A | BC | | | structural | other/centromere |
1i3j | A | A | BC | | | enzyme | nuclease |
1iaw | AB | A | CDEF | 1iaw:BA | 1ev7:AB | enzyme | hydrolase |
1j3e | A | A | BC | | | regulatory | replication |
1je8 | AB | A | CD | 1je8:B | 1a04:A | regulatory | transcription_factor |
1jko | C | C | AB | | | enzyme | recombinase |
1jx4 | A | A | PT | 1jx4:A | 2rdi:A | enzyme | polymerase |
1k3x | A | A | BC | 1k3x:A | 1q39:A | enzyme | nuclease |
1k4t | A | A | BCD | | | enzyme | isomerase |
1ku7 | A | A | BC | 1ku7:A | 1ku3:A | regulatory | transcription_factor |
1kx5 | ABCDEFGH | B, D, A, C | IJ | | | structural | histone |
1l3l | BD | B | EG | | | regulatory | transcription_factor |
1lmb | 34 | 3 | 12 | | | regulatory | other |
1lq1 | CD | C | EF | | | regulatory | transcription_factor |
1mjo | ABCD | A | FG | 1mjo:AB | 1mjl:AB | regulatory | transcription_factor |
1mnn | A | A | BC | 1mnn:A | 1mn4:A | regulatory | transcription_factor |
1nkp | AB | B, A | FG | | | regulatory | transcription_factor |
1oe4 | AB | A | EF | | | enzyme | glycosylase |
1orn | A | A | BC | | | enzyme | nuclease |
1oup | B | B | CD | 1oup:B | 1ouo:A | enzyme | nuclease |
1owf | AB | A, B | CDE | | | regulatory | transcription_factor |
1ozj | A | A | CD | | | regulatory | transcription_factor |
1pp7 | U | U | EF | | | regulatory | transcription_factor |
1pt3 | A | A | CDEFGH | 1pt3:A | 3zfk:A | enzyme | nuclease |
1qna | A | A | CD | 1qna:A | 1vok:A | regulatory | transcription_factor |
1r71 | AB | A | EFIJ | | | regulatory | transcription_factor |
1rh6 | B | B | CD | | | regulatory | recombination |
1rxw | A | A | BC | | | enzyme | nuclease |
1sa3 | A | A | CD | | | enzyme | nuclease |
1skn | P | P | AB | | | regulatory | transcription_factor |
1sx5 | AB | A | CDEF | 1sx5:AB | 1az3:AB | enzyme | nuclease |
1sxq | A | A | CE | 1sxq:A | 1jg7:A | enzyme | transferase |
1t7p | A | A | PT | | | enzyme | polymerase |
1t9i | AB | A | CD | 1t9i:AB | 2o7m:AB | enzyme | nuclease |
1tc3 | C | C | AB | | | enzyme | other |
1tez | A | A | IJK | 1tez:A | 1owl:A | enzyme | lyase |
1u8b | A | A | BCDE | | | regulatory | other |
1uut | A | A | C | 1uut:A | 1m55:A | enzyme | nuclease |
1wb9 | AB | A | EF | | | regulatory | repair |
1xyi | A | A | BC | | | structural | chromosomal |
1yf3 | A | A | CD | 1yf3:A | 1q0s:A | enzyme | methyl |
1yo5 | C | C | AB | | | regulatory | transcription_factor |
1zme | CD | C | AB | | | regulatory | transcription_factor |
1zrf | AB | A | WXYZ | 1zrf:AB | 4r8h:AB | regulatory | other |
2aor | A | A | CD | | | enzyme | methyl |
2aq4 | A | A | PT | | | enzyme | transferase |
2bnw | ABCD | A | EFGH | 2bnw:CD | 1irq:AB | regulatory | other |
2dp6 | A | A | CD | 2dp6:A | 2d3y:A | enzyme | glycosylase |
2e52 | AB | A | EG | | | enzyme | nuclease |
2ex5 | AB | A | XY | | | enzyme | nuclease |
2fkc | A | A | CD | 2fkc:A | 1ynm:A | enzyme | nuclease |
2g1p | A | A | FG | 2g1p:A | 4gom:D | enzyme | methyl |
2gb7 | AB | A | EF | | | enzyme | nuclease |
2h27 | A | A | BC | | | enzyme | transferase |
2h7g | X | X | YZ | | | enzyme | isomerase |
2i06 | A | A | BC | | | regulatory | replication |
2ih2 | A | A | BC | 2ih2:A | 1aqj:B | enzyme | methyl |
2ihm | A | A | DPT | | | enzyme | polymerase |
2is6 | A | A | CD | 2is6:A | 3lfu:A | enzyme | helicase |
2isz | AB | A | EF | 2isz:BA, 2isz:B | 2isy:AB, 1b1b:A | regulatory | transcription_factor |
2noh | A | A | BC | 2noh:A | 5an4:A | enzyme | glycosylase |
2nq9 | A | A | BCD | 2nq9:A | 1qtw:A | enzyme | nuclease |
2o4a | A | A | BC | | | regulatory | transcription_factor |
2ofi | A | A | BC | 2ofi:A | 2ofk:A | enzyme | glycosylase |
2pi0 | B | B | EF | 2pi0:B | 3qu6:A | regulatory | other |
2pyj | A | A | XY | 2pyj:A | 1xhx:A | enzyme | polymerase |
2qhb | B | B | CD | 2qhb:B | 2ckx:A | structural | telomere |
2qoj | Z | Z | XY | | | enzyme | nuclease |
2r1j | LR | L | AB | | | regulatory | transcription_factor |
2r9l | A | A | CD | 2r9l:A | 2iru:A | enzyme | polymerase |
2rbf | AB | A | CD | 2rbf:BA | 2gpe:AB | regulatory | other |
2ve9 | ABC | A | IJ | 2ve9:A | 2ve8:A | structural | other |
2vla | A | A | LM | | | enzyme | nuclease |
2vs7 | A | A | BC | 2vs7:A | 1b24:A | enzyme | nuclease |
2w42 | A | A | PQ | 2w42:A | 1w9h:A | regulatory | other |
2w7n | AB | A | EFGH | 2w7n:BA | 5ckt:AD | regulatory | gene |
2xm3 | CD | C | KLMN | | | enzyme | other/transposase |
2xrz | A | A | CD | 2xrz:A | 2xry:A | enzyme | lyase |
2xzf | A | A | BC | | | enzyme | glycosylase |
2yvh | AB | A | EFGH | 2yvh:AB | 2yve:AB | regulatory | transcription_factor |
3aaf | A | A | CD | | | enzyme | other |
3bep | AB | A | CD | 3bep:BA | 4k3l:AB | enzyme | polymerase |
3bm3 | AB | A | CD | | | enzyme | nuclease |
3bs1 | A | A | BC | 3bs1:A | 4g4k:A | regulatory | gene |
3c0w | A | A | BCD | | | enzyme | nuclease |
3c25 | AB | A | CD | 3c25:AB | 3bvq:AB | enzyme | nuclease |
3coq | AB | A | DE | | | regulatory | transcription_factor |
3cw7 | ABCD | A | EFGH | 3cw7:B | 1mpg:A | enzyme | glycosylase |
3dsd | AB | A | C | 3dsd:BA | 1ii7:AB | regulatory | repair |
3dvo | AB | A | EF | | | enzyme | nuclease |
3eeo | A | A | CD | 3eeo:A | 1hmy:A | enzyme | methyl |
3f2b | A | A | PT | | | enzyme | polymerase |
3fde | A | A | DE | 3fde:A | 2zkg:A | enzyme | ligase |
3fdq | AB | A | CD | | | regulatory | other |
3g00 | A | A | HI | 3g00:A | 3g8v:A | enzyme | nuclease |
3g0q | A | A | BC | | | enzyme | hydrolase |
3g9m | AB | A | CD | | | regulatory | transcription_factor |
3gox | AB | A | CD | | | enzyme | nuclease |
3gxq | AB | A | CD | | | regulatory | other |
3h0d | AB | A | CD | | | regulatory | transcription_factor |
3i0w | A | A | BC | 3i0w:A | 3f10:A | enzyme | glycosylase |
3iag | C | C | AB | | | regulatory | transcription_factor |
3iay | A | A | PT | | | enzyme | polymerase |
3igm | AB | A | CDWX | | | regulatory | transcription_factor |
3ikt | AB | A | CD | 3ikt:AB | 3ikv:AB | regulatory | other |
3jso | AB | A | CD | 3jso:AB | 1jhf:AB | regulatory | other |
3jxy | A | A | BC | 3jxy:A | 3bvs:A | enzyme | glycosylase |
3k59 | A | A | PT | 3k59:A | 3k5o:A | enzyme | polymerase |
3kde | C | C | AB | | | enzyme | other |
3kxt | A | A | BC | | | structural | other |
3l2c | A | A | BC | | | regulatory | transcription_factor |
3lap | ABCDEF | A | GHIJKL | | | regulatory | other |
3m4a | A | A | DE | 3m4a:A | 2f4q:A | enzyme | isomerase |
3mfi | A | A | PT | 3mfi:A | 1jih:A | enzyme | polymerase |
3mln | AB | A | CD | | | regulatory | transcription_factor |
3mva | O | O | DE | | | regulatory | transcription_factor |
3mx4 | AH | A | KL | | | enzyme | nuclease |
3o1t | A | A | BC | 3o1t:A | 4jht:A | enzyme | other |
3o9x | AB | A | EF | 3o9x:AB | 3gn5:AB | regulatory | gene |
3od8 | A | A | IJ | | | enzyme | other |
3pov | A | A | CD | 3pov:A | 3fhd:A | enzyme | other |
3pvi | AB | A | CD | 3pvi:AB | 1k0z:AB | enzyme | nuclease |
3pvv | A | A | CD | | | regulatory | replication |
3qex | A | A | PT | 3qex:A | 3cfo:A | enzyme | polymerase |
3qmd | A | A | BC | | | regulatory | other |
3qqy | A | A | BC | | | enzyme | nuclease |
3qws | AB | A | CN | 3qws:AB | 2hin:AB | regulatory | other |
3rkq | A | A | CD | | | regulatory | transcription_factor |
3rmp | AC | A | EFGH | | | enzyme | other |
3s57 | A | A | BC | | | enzyme | other |
3s8q | AB | A | CD | 3s8q:BA | 4i6r:AB | regulatory | other |
3sjm | A | A | CD | | | structural | telomere |
3sm4 | ABC | A | DE | 3sm4:CAB | 1avq:ABC | enzyme | nuclease |
3spd | A | A | EF | 3spd:A | 3sp4:A | enzyme | hydrolase |
3ssc | A | A | CD | | | enzyme | nuclease |
3tan | A | A | BC | | | enzyme | polymerase |
3tq6 | A | A | CD | | | regulatory | transcription_factor |
3u2b | C | C | AB | | | regulatory | transcription_factor |
3vk8 | A | A | CD | 3vk8:A | 3a42:A | enzyme | glycosylase |
3vw3 | LH | L, H | AB | | | other | antibody |
3vxv | A | A | BC | | | enzyme | hydrolase |
3zvk | FG | F | XY | | | regulatory | other |
3zvn | A | A | EFGHI | 3zvn:A | 3zvl:A | enzyme | hydrolase |
4aij | AB | A | CD | 4aij:BA | 4aih:AB | regulatory | transcription_factor |
4dih | H | H | D | 4dih:H | 3nxp:A | enzyme | thrombin |
4e9f | A | A | CD | 4e9f:A | 4e9e:A | enzyme | Hydrolase/glycosylase |
4ecq | A | A | PT | | | enzyme | polymerase |
4esj | A | A | CD | | | enzyme | nuclease |
4fzx | C | C | AB | | | enzyme | nuclease |
4g92 | ABC | B, C, A | DE | 4g92:ABC | 4g91:ABC | regulatory | transcription_factor |
4gck | AB | A | WZ | 4gck:AB | 4gfl:AB | other | other |
4gjr | AB | A | GHIJ | | | regulatory | transcription_factor |
4glx | A | A | BCD | 4glx:A | 5tt5:A | enzyme | ligase |
4gzn | C | C | AB | | | regulatory | transcription_factor |
4h0e | B | B | TU | | | regulatory | transcription_factor |
4h10 | AB | B, A | CD | | | regulatory | transcription_factor |
4hf1 | AB | A | CD | 4hf1:AB | 4hf0:AB | regulatory | transcription_factor |
4hqe | AB | A | CD | 4hqe:BA | 4hqm:AB | regulatory | transcription_factor |
4htu | A | A | CD | 4htu:A | 1zbf:A | enzyme | nuclease |
4i2o | AB | A | XW | | | regulatory | other |
4ix7 | AB | A | CD | | | regulatory | other |
4j3n | AB | A | CDEF | | | enzyme | isomerase |
4jbm | A | A | RT | | | regulatory | other |
4jcy | AB | A | CD | 4jcy:BA | 3lis:AB | regulatory | other |
4k98 | A | A | DE | 4k98:A | 4k8v:C | enzyme | transferase |
4kb1 | A | A | C | | | enzyme | hydrolase |
4kli | A | A | DPT | | | enzyme | polymerase |
4kpy | A | A | CDN | | | NoClass | NoClass |
4qtj | A | A | BC | | | regulatory | transcription_factor |
4rkh | CEF | C | AB | | | enzyme | ligase |
6pax | A | A | BC | | | regulatory | transcription_factor |
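Rows of Table 1 that carry all eight columns can be read programmatically. The sketch below is one possible way to do this and is not part of the benchmark distribution: `TableRow`, `parse_row` and `_split_refs` are hypothetical names, and the parser assumes pipe-separated cells, `ID:chains` items in the HOLO and APO columns, and commas between items if a cell holds more than one.

```python
from typing import List, NamedTuple, Tuple


class TableRow(NamedTuple):
    pdb_id: str
    prot_chains: str
    prot_nr_chains: str
    dna_chains: str
    holo: List[Tuple[str, str]]   # (PDB ID, chains) items, empty if no pair
    apo: List[Tuple[str, str]]
    protein_class: str
    subclass: str


def _split_refs(cell: str) -> List[Tuple[str, str]]:
    """Turn 'ID:chains' items (comma-separated, an assumption) into tuples."""
    refs = []
    for item in cell.split(","):
        item = item.strip()
        if not item:
            continue
        pdb_id, sep, chains = item.partition(":")
        if not sep:
            raise ValueError(f"expected 'ID:chains', got {item!r}")
        refs.append((pdb_id, chains))
    return refs


def parse_row(line: str) -> TableRow:
    """Parse one pipe-separated table row with exactly eight cells."""
    text = line.strip()
    if text.endswith("|"):
        text = text[:-1]          # drop the single trailing delimiter
    cells = [c.strip() for c in text.split("|")]
    if len(cells) != 8:
        raise ValueError(f"expected 8 cells, got {len(cells)}")
    return TableRow(cells[0], cells[1], cells[2], cells[3],
                    _split_refs(cells[4]), _split_refs(cells[5]),
                    cells[6], cells[7])


row = parse_row("1a73 | AB | A | CDEF | 1a73:BA | 1evx:AB | enzyme | nuclease |")
print(row.pdb_id, row.holo, row.apo, row.subclass)
```

A row whose cells do not line up with the eight columns raises `ValueError` rather than being silently misread.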
Contact
For questions, comments or suggestions feel free to contact Flavia Corsi, Elodie Laine or Alessandra Carbone.