Structural descriptor database: a new tool for sequence-based functional site prediction.

Title: Structural descriptor database: a new tool for sequence-based functional site prediction.
Publication Type: Journal Article
Year of Publication: 2008
Authors: Bernardes, JS, Fernandez, JH, Vasconcelos, ATereza R
Journal: BMC Bioinformatics
Date Published: 2008 Nov 25
Keywords: Algorithms; Artificial Intelligence; Binding Sites; Catalytic Domain; Databases, Protein; Information Storage and Retrieval; Internet; Pattern Recognition, Automated; Protein Conformation; Proteins; Proteomics; Sequence Analysis, Protein; Software; Structure-Activity Relationship

BACKGROUND: The Structural Descriptor Database (SDDB) is a web-based tool that predicts protein function and functional site positions from the structural properties of related protein families. Structural alignments and functional residues of a set of known proteins (the training set) are used to build specialized Hidden Markov Models (HMMs) called HMM descriptors. SDDB uses precomputed, stored HMM descriptors to predict active sites, binding residues, and protein function. The database integrates biologically relevant data filtered from several databases, including PDB, PDBSUM, CSA, and SCOP. It accepts queries in FASTA format and predicts functional residue positions, protein-ligand interactions, and protein function, based on the SCOP database.

RESULTS: We assessed SDDB performance on several data sets. The trypsin-like serine protease data set assessed how well SDDB predicts functional sites when curated data are available. The SCOP family data set was used to analyze SDDB performance with training data extracted from PDBSUM (binding sites) and from CSA (active sites). The ATP-binding experiment compared our approach with the most recent method. In all evaluations, significant improvements were obtained with SDDB.

CONCLUSION: SDDB performed better when reliable training data were available. SDDB predicted active sites better than binding sites because active sites are more conserved. Nevertheless, our prediction method obtained results with precision above 70%.

Alternate Journal: BMC Bioinformatics
PubMed ID: 19032768
PubMed Central ID: PMC2612011
