KFC/STBI Structural Bioinformatics 04 Databases Karel Berka
Databases: there is no shortage of them http://www.rcsb.org/pdb/
Primary structural databases
PDBe (Protein Data Bank in Europe): supplements the PDB with data from BMRB (NMR) and EMDB (EM)
PDBsum: gathers additional information about each structure
PDBwiki: a community-annotated knowledge base of biological molecular structures; a Wikipedia-style resource on PDB structures
NDB (Nucleic Acid Structure Database): database of nucleic acid structures
CSD (Cambridge Structural Database): database of small-molecule crystal structures; paid
MODBASE (Database of Comparative Protein Structure Models): database of protein models
Secondary databases
SCOP (Structural Classification of Proteins): search for structural families of proteins
CATH: search for structural families of proteins
GENE3D: structural genomics
3Dee: Database of Protein Domain Definitions
FSSP: based on exhaustive all-against-all 3D structure comparison of the protein structures currently in the Protein Data Bank (PDB)
DALI: fold classification based on structure-structure assignments
PDBe http://www.ebi.ac.uk/pdbe/
Integrated relational database of macromolecular structures
PDBe Atlas page
Figure: example of an Atlas page, in this case for PDB entry 1E9F. Callouts: navigation menu; sequence annotated from other databases (UniProt, CATH, Pfam, SCOP).
The Author(s) 2009. Published by Oxford University Press. Velankar S et al. Nucl. Acids Res. 2010;38:D308-D317
SIFTS format
SIFTS: Structure Integration with Function, Taxonomy and Sequence
Figure: schematic overview of the process by which SIFTS files are generated (see the cited paper for details). Velankar S et al. Nucl. Acids Res. 2010;38:D308-D317
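SIFTS links residues in PDB entries to positions in other resources such as UniProt. A minimal sketch of how a segment-level mapping of that kind could be used to translate a residue position. The `Segment` record, its field names and the example numbers are hypothetical simplifications, not the actual SIFTS file layout, which should be taken from the PDBe documentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """One contiguous mapped stretch (hypothetical, simplified record)."""
    pdb_start: int      # first position in the PDB chain sequence
    pdb_end: int        # last position in the PDB chain sequence (inclusive)
    uniprot_start: int  # UniProt position that corresponds to pdb_start


def pdb_to_uniprot(position: int, segments: List[Segment]) -> Optional[int]:
    """Translate a PDB sequence position to UniProt numbering.

    Returns None when the position lies outside every mapped segment,
    for example in an expression tag that is absent from UniProt.
    """
    for seg in segments:
        if seg.pdb_start <= position <= seg.pdb_end:
            return seg.uniprot_start + (position - seg.pdb_start)
    return None


# Made-up illustrative mapping, not taken from a real entry.
example = [Segment(1, 100, 25), Segment(101, 150, 140)]
print(pdb_to_uniprot(10, example))   # 34
print(pdb_to_uniprot(120, example))  # 159
print(pdb_to_uniprot(200, example))  # None
```

Keeping the mapping as a list of segments rather than one offset allows for gaps and insertions between the two numbering schemes.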
PDBe services
BIObar: search system implemented as a toolbar application for Mozilla browsers. http://www.ebi.ac.uk/pdbe/docs/biobar.html
PDBeStatus: search system to query the status of PDB entries. http://www.ebi.ac.uk/pdbe-as/pdbstatus
PDBeMapQuick: quick access to cross-reference information to external databases based on PDB ID. http://www.ebi.ac.uk/pdbe-as/pdbemapquick/
PDBeView: text-based and advanced PDB search tool. http://www.ebi.ac.uk/pdbe-srv/view
PDBeLite: search system based on the relational PDBe database. http://www.ebi.ac.uk/pdbe-srv/pdbelite
EMsearch: search system for the EM Database. http://www.ebi.ac.uk/pdbe-srv/emsearch
PDBeChem: ligand search using the PDB reference dictionary. http://www.ebi.ac.uk/msd-srv/chempdb
PDBeMotif: query and analysis of structure, sequence motifs and interactions. http://www.ebi.ac.uk/pdbe-site/pdbemotif/
PDBePISA: search and analysis of Protein Interfaces, Surfaces and Assemblies. http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
PDBeFold: Secondary Structure Matching (SSM) service for comparing protein structures in 3D. http://www.ebi.ac.uk/msd-srv/ssm/
PDBeTemplate: search of local residue interactions in the PDB. http://www.ebi.ac.uk/pdbe-as/pdbetemplate/
PDBeAnalysis: validation and analysis of PDBe data. http://www.ebi.ac.uk/pdbe-as/pdbevalidate
OLDERADO: clustering information for NMR entries in the PDB. http://www.ebi.ac.uk/pdbe/olderado/
PDBeMine: supports ad-hoc queries and data analysis based on the relational PDBe database. http://www.ebi.ac.uk/pdbe-srv/msdmine
Biobar
A toolbar search application for Mozilla/Netscape or Firefox browsers. http://biobar.mozdev.org/
Simple and quick retrieval of data from PDBe and 45 other databases
Ligands in the PDB: PDBeChem
Bound molecules (e.g. sugars, lipids, inhibitors, coenzymes and cofactors)
Unique three-letter code
Recorded for each ligand: atoms, element types, connectivity, bond orders, stereochemical configuration
Search by:
ligand code
ligand name
formula
non-stereo SMILES
stereo SMILES
exact stereo structure
fingerprint similarity
fragment expression
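Searching by fingerprint similarity means ranking ligands by how many structural features they share with a query. A minimal sketch using the Tanimoto coefficient on fingerprints stored as sets of on-bits. The choice of Tanimoto, the three-letter codes and the bit positions are assumptions for illustration; the slide does not say which fingerprint or metric PDBeChem uses.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def rank_by_similarity(
    query: Set[int], library: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Return (ligand code, similarity) pairs, most similar first."""
    scored = [(code, tanimoto(query, bits)) for code, bits in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical ligand codes and fingerprint bits, for illustration only.
library = {"AAA": {1, 2, 3, 4}, "BBB": {1, 2, 7}, "CCC": {8, 9}}
query = {1, 2, 3}

for code, score in rank_by_similarity(query, library):
    print(code, score)  # AAA 0.75, then BBB 0.5, then CCC 0.0
```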
PDBeMotif
Search by:
a) ligands and their 3D environment
b) protein families (SCOP, CATH, UniProt, EC number)
c) protein secondary structures and different 3D motifs (PROSITE, beta turn, catalytic sites etc.)
d) protein Φ/Ψ angle sequences
Results:
a) multiple sequence alignment
b) 3D multiple alignment of fragments, motifs and protein chains
c) interaction statistics
d) motif characteristics and property distribution charts
Figure: example of a graphically defined query that can be submitted to PDBeMotif. Velankar S et al. Nucl. Acids Res. 2010;38:D308-D317
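The Φ/Ψ angles searched in d) are backbone dihedral angles: Φ is defined by the atoms C(i-1), N(i), CA(i), C(i) and Ψ by N(i), CA(i), C(i), N(i+1). A minimal sketch of computing a dihedral angle from four points. The coordinates are made up, and this is a generic geometric formula, not PDBeMotif's own code.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a: Vec, s: float) -> Vec:
    return (a[0] * s, a[1] * s, a[2] * s)


def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Dihedral angle in degrees around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))  # unit vector along the bond
    # Remove the components parallel to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up points: the outer atoms are a quarter turn apart around the z axis.
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

Applying the function to consecutive backbone atoms of each residue gives the Φ/Ψ sequence of a chain.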
PDBe-site page
Define search by ligand
Define search by sequence motif (pattern)
Define search by metal site geometry
Define search by environment: has same environment, has similar environment
Use cases: compare ligand environments; analyze interactions between ligand and protein; compare binding environments; look for ligands within a certain environment; superpose binding sites and ligands; predict what could bind the empty pocket in your structure.
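A search by sequence motif amounts to matching a pattern against protein sequences. A minimal sketch that converts a PROSITE-style pattern into a regular expression. It covers only a subset of the PROSITE syntax (`x`, `[..]`, `{..}`, repeat counts, `<` and `>` anchors), the test sequence is made up, and whether the PDBe-site form accepts exactly this syntax is an assumption.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a PROSITE-style pattern (subset of the syntax) to a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {ABC} means "any residue except A, B or C".
        element = element.replace("{", "[^").replace("}", "]")
        # (n) or (n,m) are repeat counts.
        element = element.replace("(", "{").replace(")", "}")
        # x stands for any residue.
        element = element.replace("x", ".")
        # < and > anchor the pattern to the N- and C-terminus.
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


regex = prosite_to_regex("[AG]-x(4)-G-K-[ST]")
print(regex)  # [AG].{4}GK[ST]

sequence = "MAGESGAGKSTLL"  # made-up sequence for illustration
match = re.search(regex, sequence)
if match:
    print(match.start(), match.group())  # 2 GESGAGKS
```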
PQS (Protein Quaternary Structure), PDBePISA
What assembly can my structure have?
Very difficult to obtain by prediction. Experimental sources: crystallography and EM.
EMViewer
The new EMViewer 3D visualization Java applet is available on the EMDB Atlas pages and allows interactive generation of isosurface representations. Velankar S et al. Nucl. Acids Res. 2010;38:D308-D317
PDBsum
PDBsum
Aims to have all information in one place
Additional analyses: secondary structure diagram, LIGPLOT
Figure: schematic diagrams from the PDBsum Protein page for entry 1a5z, lactate dehydrogenase from Thermotoga maritima. 2008 The Author(s) Laskowski R A Nucl. Acids Res. 2009;37:D355-D359
PDBsum interfaces
Figure: extracts from the protein-protein interaction diagrams in PDBsum for PDB entry 1mmo, a non-haem iron hydroxylase from Methylococcus capsulatus. Laskowski R A Nucl. Acids Res. 2009;37:D355-D359
NDB
NDB DNA RNA
NDB 3D structure; 2D structure (RNAview)
CSD: The Cambridge Structural Database www.ccdc.cam.ac.uk
Small molecules; paid, plus an open set of 500 compounds for teaching purposes

Database   Content                      Total (2009)   Per year
CRYSTMET   Metals, alloys, inorganics   119600         9000
ICSD       Inorganics & minerals        100200         9000
CSD        Organics, metal-organics     488057         40000
NDB        Nucleic acids                3555           500
PDB        Proteins                     50730          6000
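The totals and yearly increments listed above can be compared more easily as relative growth. A small sketch of that arithmetic, using only the 2009 figures from this slide; the function name is illustrative.

```python
# (total entries in 2009, new entries per year), as listed on the slide
databases = {
    "CRYSTMET": (119600, 9000),
    "ICSD": (100200, 9000),
    "CSD": (488057, 40000),
    "NDB": (3555, 500),
    "PDB": (50730, 6000),
}


def relative_growth(total: int, per_year: int) -> float:
    """Yearly increment as a fraction of the current total."""
    return per_year / total


ranked = sorted(databases, key=lambda name: relative_growth(*databases[name]), reverse=True)
for name in ranked:
    total, per_year = databases[name]
    print(f"{name}: {100 * relative_growth(total, per_year):.1f} % per year")
```

By this measure the smaller macromolecular databases (NDB, PDB) were growing faster in relative terms than the much larger CSD.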
CSD - components
WebCSD
Mercury
Mercury visualiser: crystal structure visualisation program by CCDC. Free; teaching subset embedded.
And back to proteins...
Classification of protein structures: SCOP, CATH, FSSP, 3Dee
Class: similar content of secondary structures
Architecture (Fold): structural similarity
Superclass (Topology): probably the same ancestor
SCOP: Structural Classification of Proteins
Manual classification of protein structural domains based on similarities of their amino acid sequences and three-dimensional structures.
SCOP uses four levels of hierarchical structural classification:
class: general "structural architecture" of the domain
fold: similar arrangement of regular secondary structures, but without evidence of evolutionary relatedness
superfamily: sufficient structural and functional similarity to infer a divergent evolutionary relationship, but not necessarily detectable sequence homology
family: some sequence similarity can be detected
Murzin A. G., Brenner S. E., Hubbard T., Chothia C. (1995). SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536-540.
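The four SCOP levels are commonly written as one dotted string (class letter, then fold, superfamily and family numbers). A minimal sketch that finds the deepest level two domains share, assuming that four-part string form; the identifiers in the example are used purely as illustrative strings.

```python
from typing import Optional, Tuple

LEVELS = ("class", "fold", "superfamily", "family")


def parse_sccs(sccs: str) -> Tuple[str, ...]:
    """Split a four-part classification string such as 'c.37.1.8'."""
    parts = tuple(sccs.split("."))
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected four dot-separated fields, got {sccs!r}")
    return parts


def deepest_shared_level(a: str, b: str) -> Optional[str]:
    """Name of the most specific level at which two domains still agree."""
    shared = None
    for level, x, y in zip(LEVELS, parse_sccs(a), parse_sccs(b)):
        if x != y:
            break
        shared = level
    return shared


print(deepest_shared_level("c.37.1.8", "c.37.1.12"))  # superfamily
print(deepest_shared_level("c.37.1.8", "c.2.1.1"))    # class
print(deepest_shared_level("a.1.1.1", "b.1.1.1"))     # None
```

Two domains that agree down to the superfamily level are, in SCOP terms, inferred to share a common ancestor even if their sequences no longer look alike.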
CATH
Manually curated hierarchical classification of protein domain structures; more automated than SCOP.
Class: secondary structure content (mainly-alpha, mainly-beta, mixed alpha/beta or 'few secondary structures')
Architecture: general arrangement of the secondary structures irrespective of connectivity between them (e.g. alpha/beta sandwich)
Topology (Fold): connectivity of secondary structures in the chain
Homologous Superfamily: domains that are believed to be related by a common ancestor
S-levels: automated clustering based on sequence identity
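A CATH classification is usually written as four dot-separated numbers, one per level (C.A.T.H). A minimal sketch that expands such a code into its levels. The class names follow the list above, and the code `3.40.50.300` is used only as an example string.

```python
from typing import Dict

# Class numbers 1-4 as used by CATH, named as on this slide.
CATH_CLASSES = {
    1: "mainly-alpha",
    2: "mainly-beta",
    3: "mixed alpha/beta",
    4: "few secondary structures",
}


def describe_cath(code: str) -> Dict[str, str]:
    """Expand a C.A.T.H code into the identifier of each level."""
    fields = code.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields, got {code!r}")
    c, a, t, h = fields
    return {
        "class": f"{c} ({CATH_CLASSES.get(int(c), 'unknown')})",
        "architecture": f"{c}.{a}",
        "topology": f"{c}.{a}.{t}",
        "homologous_superfamily": f"{c}.{a}.{t}.{h}",
    }


for level, value in describe_cath("3.40.50.300").items():
    print(f"{level}: {value}")
```

Each level's identifier is a prefix of the next, so domains can be grouped at any depth by comparing prefixes.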
CATH
Gene3D
A large collection of CATH protein domain assignments for Ensembl genomes and UniProt sequences, with functional information as well as taxonomic distributions, multi-domain architectures and protein-protein interaction (PPI) data.
FSSP - fold classification
www2.emblebi.ac.uk/dali/fssp/
Proteins structurally superimposed by DALI ("Distance-matrix ALIgnment")
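The "distance matrix" in the DALI name refers to the matrix of distances between residues within one structure. A minimal sketch of that representation and of why it is convenient: the matrix does not change when the structure is rotated or shifted. This is not the DALI algorithm itself, which aligns sub-matrices and optimises a similarity score; the coordinates are made up.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def distance_matrix(coords: List[Point]) -> List[List[float]]:
    """All pairwise distances between points (e.g. C-alpha atoms)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def mean_abs_difference(a: List[List[float]], b: List[List[float]]) -> float:
    """Average absolute difference over the upper triangles of two matrices."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("matrices must have the same size, at least 2 x 2")
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(abs(a[i][j] - b[i][j]) for i, j in pairs) / len(pairs)


# Made-up coordinates, and the same points rotated by 90 degrees about z and shifted.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0), (8.1, 4.0, 2.2)]
moved = [(-y + 5.0, x - 2.0, z + 1.0) for x, y, z in coords]

difference = mean_abs_difference(distance_matrix(coords), distance_matrix(moved))
print(difference < 1e-9)  # True: the internal distances are unchanged
```

Because the comparison works on internal distances, two structures can be compared without first superimposing them.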
3Dee: domains http://www.compbio.dundee.ac.uk/3dee/
Hierarchy of individual domains; clustering by structural similarity
Dengler, U., Siddiqui, A. S. & Barton, G. J. (2001). Protein structural domains: Analysis of the 3Dee domains database. Proteins 42, 332-344.
Siddiqui, A. S., Dengler, U. & Barton, G. J. (2001). 3Dee: A database of protein structural domains. Bioinformatics 17, 200-201.
Databases we did not get to...
Relibase: protein-ligand interactions
Modbase, SWISSModel repository, MMDB: databases of models
MolMovdb: Macromolecular Motions database
And plenty of others, mostly specific to a given problem, e.g. only for cytochromes P450: CYPED, SuperCyp, Cytochrome P450 Homepage, Fungal CYP database, CYPallelles, Arabidopsis Cytochrome P450s, Cytochrome P450 Drug Interactions Table, and others.
After that, there is nothing left but to use Google. :o)