From Wikipedia, the free encyclopedia

Protein Databases Information

Protein databases have become a crucial part of modern biology. Huge amounts of data for protein structures, functions, and particularly sequences are being generated. Searching databases is often the first step in the study of a new protein. Comparison between proteins or between protein families provides information about the relationship between proteins within a genome or across different species, and hence offers much more information than can be obtained by studying only an isolated protein. In addition, secondary databases derived from experimental databases are also widely available. These databases reorganize and annotate the data or provide predictions. The use of multiple databases often helps researchers understand the structure and function of a protein. Although some protein databases are widely known, they are far from being fully utilized in the protein science community. This unit provides a starting point for readers to explore the potential of protein databases on the Internet.

Keywords: Bioinformatics, Biological Databases, Protein Analysis, Protein Modeling

INTRODUCTION

Protein databases have become a crucial part of modern biology. Huge amounts of data for protein structures, functions, and particularly sequences are being generated. These data cannot be handled without using computer databases. Searching databases is often the first step in the study of a new protein. Without the prior knowledge obtained from such searches, known information about the protein could be missed, or an experiment could be repeated unnecessarily. Comparison between proteins and protein classification provide information about the relationship between proteins within a genome or across different species, and hence offer much more information than can be obtained by studying only an isolated protein. In this sense, protein comparison through databases allows one to view life as a forest instead of individual trees. In addition, secondary databases derived from experimental databases are also widely available. These databases reorganize and annotate the data or provide predictions. The use of multiple databases often helps researchers understand evolution, structure, and function of a protein.

Protein databases owe much of their power to the Internet. Unlike traditional media, such as the CD-ROM, the Internet allows databases to be easily maintained and frequently updated at minimal cost. Researchers with limited resources can afford to set up their own databases and disseminate their data quickly. Notably, many small databases on specific types of proteins, such as the EF-Hand Calcium-Binding Proteins Data Library (http://structbio.vanderbilt.edu/cabp_database/), are widely available. Users worldwide can easily access the most up-to-date version through a user-friendly interface. Most protein databases have interactive search engines so that users can specify their needs and obtain the related information interactively. Many protein databases also allow submitters to deposit data, and database servers can check the format of the data and provide immediate feedback.

Although some protein databases are widely known, they are far from being fully utilized in the protein science community. This unit provides a starting point for readers to explore the potential of protein databases on the Internet. Databases for different aspects of proteins are discussed with the focus on sequence, structure, and family. The strengths and weaknesses of the databases are addressed. For Web addresses of the databases discussed in this unit, see Internet Resources and Table 19.4.1. From hundreds of on-line protein databases, several major databases are discussed as examples to illustrate their features and how they can be used effectively. Most other protein databases can be explored in a similar way.

The Protein Data Bank (PDB) was established in 1971 as the central archive of all experimentally determined protein structure data. Today the PDB is maintained by an international consortium known as the Worldwide Protein Data Bank (wwPDB). The mission of the wwPDB is to maintain a single archive of macromolecular structural data that is freely and publicly available to the global community.

Database of Macromolecular Movements

The Database of Macromolecular Motions (molmovdb) is a bioinformatics database and software-as-a-service tool that attempts to categorize macromolecular motions, sometimes also known as conformational change. [1] [2] [3] It was originally developed by Mark B. Gerstein, Werner Krebs, and Nat Echols in the Molecular Biophysics & Biochemistry Department at Yale University. [4] [5]

It attempts to systematize all instances of protein and nucleic acid movement for which there is at least some structural information. At present it contains >120 motions, most of which are of proteins. The database contains plausible representations for motion pathways, derived from restrained 3D interpolation between known endpoint conformations. These pathways can be viewed in a variety of movie formats, and the database is associated with a server that can automatically generate these movies from submitted coordinates. [6]
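
The idea of generating intermediate frames between two known endpoint conformations can be illustrated with a minimal Python sketch. It uses plain, unrestrained linear interpolation of atom coordinates, which is a simplification and not the restrained procedure the database itself applies; the function name and the toy coordinates are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]


def interpolate_conformations(
    start: List[Coord], end: List[Coord], n_frames: int
) -> List[List[Coord]]:
    """Return n_frames coordinate sets running linearly from start to end.

    Assumption: both conformations list the same atoms in the same order.
    No geometric restraints are applied, so intermediate frames may contain
    distorted bond lengths; a restrained method would correct for that.
    """
    if len(start) != len(end):
        raise ValueError("conformations must contain the same number of atoms")
    if n_frames < 2:
        raise ValueError("need at least two frames (the two endpoints)")
    frames = []
    for k in range(n_frames):
        t = k / (n_frames - 1)  # 0.0 at the first endpoint, 1.0 at the second
        frame = [
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ]
        frames.append(frame)
    return frames


# Toy two-atom example (made-up coordinates)
frames = interpolate_conformations(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (1.0, 4.0, 0.0)],
    3,
)
```

Each frame can then be written out as a coordinate file and the sequence played back as a movie.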

Dynameomics

Dynameomics [7] is a continuing project in the Daggett group [8] to characterize the native state dynamics and the folding/unfolding pathway of representatives from all known protein folds by molecular dynamics simulation. It harbours molecular dynamics simulations of the native state and unfolding pathways of over 2000 protein/peptide systems (approximately 11,000 independent simulations) representing the majority of folds in globular proteins. These data are stored and organized so that they can be mined to obtain both general and specific information about the dynamics and folding/unfolding of proteins, relevant subsets thereof, and individual proteins.

JenaLib

The Jena Library of Biological Macromolecules (JenaLib) [9] aims at a better dissemination of information on three-dimensional biopolymer structures, with an emphasis on visualization and analysis. It provides access to all structure entries deposited at the Protein Data Bank (PDB) or at the Nucleic Acid Database [10] (NDB). In addition, basic information on the architecture of biopolymer structures is available. This includes:

(1) Atlas pages and entry lists.

(2) PDB sequence information extracted from atomic coordinates [11].

(3) PDB/UniProt sequence alignments that clearly indicate gaps, mutations, numbering irregularities and modified residues.

(4) Integration of data on single amino acid polymorphisms [12] (SAPs), PROSITE motifs, exon structure and SCOP/CATH/Pfam domains with PDB, GO and taxonomy information.

(5) Display of these data in the sequence/alignment viewer and in the Jmol-based molecule viewer Jena3D [13]; in the latter case both for asymmetric and biological units.

(6) A QuickSearch option that allows searching for PDB/NDB code, UniProt ID/accession number and other search terms in one input field.

(7) A sequence homology search (BLAST) and pattern search options.

(8) SCOP/CATH/Pfam tree browsers.

ModBase

ModBase is a database of annotated comparative protein structure models, containing models for more than 3.8 million unique protein sequences. [14] Models are created by the comparative modeling pipeline ModPipe, which relies on the MODELLER program for fold assignment, sequence–structure alignment, model building and model assessment. ModBase is developed in the laboratory of Andrej Sali at UCSF. ModBase currently contains 10,355,444 reliable models for domains in 2,421,920 unique protein sequences. ModBase allows users to update comparative models on demand, and to request modeling of additional sequences through an interface to the ModWeb [15] modeling server. ModBase models are available through the ModBase interface as well as the Protein Model Portal.

OCA

OCA [16] is a browser-database for protein structure and function that integrates information from KEGG, OMIM, PDBselect, Pfam, PubMed, SCOP, SwissProt, and others. It is a powerful alternative mechanism for searching the structures in the Protein Data Bank. OCA provides rich annotation on structure and function, generating dynamic links to these external sources. The database offers a simple search, a FASTA search, and many options for additional searches. It also allows the user to save the generated search results.

PDBsum

PDBsum is a database that provides an overview of the contents of each 3D macromolecular structure deposited in the Protein Data Bank. [17] [18] [19] [20] The original version of the database was developed around 1995 by Roman Laskowski and collaborators at University College London. [21] As of 2014, PDBsum is maintained by Laskowski and collaborators in the laboratory of Janet Thornton at the European Bioinformatics Institute (EBI).

It includes images of the structure, annotated plots of each protein chain’s secondary structure, detailed structural analyses generated by the PROMOTIF [22] program, summary PROCHECK [23] results and schematic diagrams of protein–ligand and protein–DNA interactions. RasMol scripts highlight key aspects of the structure, such as the protein’s domains, PROSITE patterns and protein–ligand interactions, for interactive viewing in 3D. Numerous links take the user to related sites. PDBsum is updated whenever any new structures are released by the PDB and is freely accessible. [24]

PDBTM

The Protein Data Bank of Transmembrane Proteins (PDBTM) is a comprehensive and up-to-date selection of the transmembrane proteins in the Protein Data Bank (PDB). The PDBTM database is maintained at the Institute of Enzymology by the Membrane Protein Bioinformatics Research Group. It was created by scanning all PDB entries with the TMDET [25] algorithm, which distinguishes transmembrane from non-transmembrane proteins using only their 3D atomic coordinates. The TMDET algorithm can also locate the spatial position of a transmembrane protein within the lipid bilayer. Since its release in 2004, numerous exotic transmembrane protein structures have been solved, and the number of database entries has increased from 400 to 17000. [26] [27]

ProtCID

The Protein Common Interface Database (ProtCID) [28] is a database of similar protein-protein interfaces in crystal structures of homologous proteins. [29]

Its main goal is to identify and cluster homodimeric and heterodimeric interfaces observed in multiple crystal forms of homologous proteins. Such interfaces, especially of non-identical proteins or protein complexes, have been associated with biologically relevant interactions. [30]

A common interface in ProtCID indicates chain-chain interactions that occur in different crystal forms. All protein sequences of known structure in the Protein Data Bank (PDB) [31] are assigned a "Pfam chain architecture", which denotes the ordered Pfam [32] assignments for that sequence, e.g. (Pkinase) or (Cyclin_N)_(Cyclin_C). Homodimeric interfaces in all crystals that contain a particular architecture are compared, regardless of whether there are other protein types in the crystals. All interfaces between two different Pfam architectures in all PDB entries that contain them are also compared (e.g., (Pkinase) and (Cyclin_N)_(Cyclin_C)). For both homodimers and heterodimers, the interfaces are clustered into common interfaces based on a similarity score.
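
The "Pfam chain architecture" notation can be made concrete with a short Python sketch that builds the architecture string from an ordered list of Pfam assignments and groups chains sharing the same architecture. The function names and chain identifiers are hypothetical, and the sketch does not reproduce ProtCID's own assignment or clustering code.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def chain_architecture(pfam_domains: Iterable[str]) -> str:
    """Join ordered Pfam assignments into a string such as (Cyclin_N)_(Cyclin_C)."""
    return "_".join(f"({name})" for name in pfam_domains)


def group_chains_by_architecture(chains: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each architecture string to the chain identifiers that carry it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for chain_id, domains in chains.items():
        groups[chain_architecture(domains)].append(chain_id)
    return dict(groups)


# Hypothetical chain identifiers with their ordered Pfam assignments
example_chains = {
    "entry1_A": ["Pkinase"],
    "entry1_B": ["Cyclin_N", "Cyclin_C"],
    "entry2_A": ["Pkinase"],
}
groups = group_chains_by_architecture(example_chains)
```

Chains that fall into the same group are the ones whose interfaces would be compared across crystal forms.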

ProtCID reports the number of crystal forms that contain a common interface, the number of PDB entries, the number of PDB and PISA [33] biological assembly annotations that contain the same interface, the average surface area, and the minimum sequence identity of proteins that contain the interface. ProtCID provides an independent check on publicly available annotations of biological interactions for PDB entries.

Protein

The NIH protein database is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and Third Party Annotation, as well as records from SwissProt, PIR, PRF, and PDB. [34] [35] [36]

Proteopedia

Proteopedia is a 3D encyclopedia of proteins and other molecules. [37] [38] [39] [40] The site contains a page for every entry in the Protein Data Bank (>130,000 pages), as well as pages that are more descriptive of protein structures in general, such as acetylcholinesterase, [41] hemoglobin, [42] and photosystem II, [43] with a Jmol view that highlights functional sites and ligands. Currently, Proteopedia has 148,468 articles. It employs a scene-authoring tool so that users do not have to learn the JSmol script language to create customized molecular scenes. [44]

123D+

123D+ threads a sequence through a set of 3D structures. It combines sequence profiles, secondary structure prediction, and contact capacity potentials to find the most compatible fold among the 3D structures, and the best alignment of the sequence with that fold.

Columba-DB: Protein Structure Annotation

This meta-server provides an integrated summary, with links to details, of information from the PDB, KEGG, ENZYME, ExPASy, DSSP, CATH, SCOP, SwissProt, NCBI Taxonomy, GO, and PISCES. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for example, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable Extensible Markup Language (XML) and human-readable format. The structures themselves can be viewed interactively on the web. [45]
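
The kind of combined query described above can be sketched in Python as a filter over locally held annotation records. This is only an illustration under stated assumptions: the `Entry` record, its field names, the placeholder identifiers and the annotation values are all hypothetical and do not reflect Columba's actual schema or interface.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Entry:
    """Hypothetical annotation record for one structure entry."""
    pdb_id: str
    cath_architecture: str        # e.g. a CATH class.architecture code
    go_functions: FrozenSet[str]  # Gene Ontology molecular-function terms
    resolution: float             # in angstroms


def select_entries(
    entries: Iterable[Entry],
    cath_architecture: str,
    go_function: str,
    max_resolution: float,
) -> List[str]:
    """Return IDs of entries that satisfy all three criteria at once."""
    return [
        e.pdb_id
        for e in entries
        if e.cath_architecture == cath_architecture
        and go_function in e.go_functions
        and e.resolution < max_resolution
    ]


# Made-up records with placeholder identifiers
records = [
    Entry("ID01", "3.40", frozenset({"kinase activity"}), 1.8),
    Entry("ID02", "3.40", frozenset({"kinase activity"}), 3.2),
    Entry("ID03", "1.10", frozenset({"kinase activity"}), 1.5),
    Entry("ID04", "3.40", frozenset({"DNA binding"}), 1.9),
]
selected = select_entries(records, "3.40", "kinase activity", 2.5)
```

Only the first record passes, because the others fail on resolution, architecture, or function respectively.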

CASTp

CASTp is a server that identifies pockets and cavities in proteins, and quantitates their volumes. Atoms lining each pocket or cavity can be displayed in Chime, RasMol, or MAGE. CASTp can be used to study surface features and functional regions of proteins. It includes a graphical interface, flexible interactive visualization, as well as on-the-fly calculation for user uploaded structures.

Conformational Epitope Prediction Server

The Conformational Epitope Prediction (CEP) server predicts possible antigenic epitopes on the surfaces of submitted protein antigen structures and displays the predicted epitopes in Jmol. The CEP server provides a web interface to a conformational epitope prediction algorithm developed in-house. Apart from predicting conformational epitopes, the algorithm also predicts antigenic determinants and sequential epitopes. The epitopes are predicted using 3D structure data of protein antigens, and can be visualized graphically. The algorithm employs a structure-based bioinformatics approach and uses the solvent accessibility of amino acids in an explicit manner. [46]

DisEMBL

DisEMBL is a computational tool [47] for prediction of disordered/unstructured regions within a protein sequence. As no clear definition of disorder exists, its developers derived parameters based on several alternative definitions, and introduced a new one based on the concept of "hot loops", i.e. coils with high temperature factors. Avoiding potentially disordered segments in protein expression constructs can increase expression, foldability and stability of the expressed protein. DisEMBL is thus useful for target selection and the design of constructs as needed for many biochemical studies, particularly structural biology and structural genomics projects. [48]
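
The "hot loops" definition (coils with high temperature factors) can be illustrated with a small Python sketch that labels residues in a solved structure. This is not DisEMBL's predictor, which works from sequence; it only shows how such labels might be derived. The three-state secondary structure string ('H' helix, 'E' strand, 'C' coil) and the `b_threshold` cutoff are assumptions made for the example.

```python
from typing import List, Sequence


def hot_loop_positions(
    secondary_structure: str,
    b_factors: Sequence[float],
    b_threshold: float,
) -> List[int]:
    """Return 0-based indices of coil residues whose B-factor exceeds the cutoff.

    Assumptions: one secondary-structure letter and one temperature factor
    per residue; 'C' marks coil; b_threshold is a user-chosen cutoff.
    """
    if len(secondary_structure) != len(b_factors):
        raise ValueError("need one B-factor per residue")
    return [
        i
        for i, (state, b) in enumerate(zip(secondary_structure, b_factors))
        if state == "C" and b > b_threshold
    ]


# Made-up seven-residue example
positions = hot_loop_positions("HHCCEEC", [10, 12, 40, 15, 11, 9, 55], 30)
```

In this toy input, residues 2 and 6 are coil residues with high temperature factors, while residue 3 is coil but well ordered.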

DisProt

"Database of Protein Disorder (DisProt) is a curated database that provides information about proteins that lack fixed 3D structure in their putatively native states, either in their entirety or in part." It is a database of experimental evidences of disorder manually collected from literature. Each evidence is identified by one experiment, the corresponding paper and the position in the sequence. When multiple experiments are available in a single paper, DisProt reports multiple evidences (even if experiments are about the same region). [49]

FSSP

FSSP (families of structurally similar proteins) is a database of structural alignments of proteins in the Protein Data Bank (PDB). The database currently contains an extended structural family for each of 330 representative protein chains. Each data set contains structural alignments of one search structure with all other structurally significantly similar proteins in the representative set (remote homologs, <30% sequence identity), as well as all structures in the Protein Data Bank with 30-70% sequence identity relative to the search structure (medium homologs). Very close homologs (above 70% sequence identity) are excluded as they rarely have marked structural differences. The alignments of remote homologs are the result of pairwise all-against-all structural comparisons in the set of 330 representative protein chains. All such comparisons are based purely on the 3D co-ordinates of the proteins and are derived by automatic (objective) structure comparison programs. The significance of structural similarity is estimated based on statistical criteria. [50]
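
The sequence-identity bands quoted above can be written down as a small Python helper. The function name and return labels are hypothetical; the handling of the exact boundary values (30% and 70% counted as medium) is an assumption, since the text does not specify it.

```python
def classify_homolog(identity_percent: float) -> str:
    """Assign a homolog to an identity band relative to the search structure."""
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("sequence identity must lie between 0 and 100")
    if identity_percent < 30.0:
        return "remote"      # compared within the representative set
    if identity_percent <= 70.0:
        return "medium"      # included from the whole PDB
    return "very close"      # excluded: rarely shows marked structural differences


bands = [classify_homolog(x) for x in (12.0, 45.0, 88.0)]
```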

SCOP and SCOP2

The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the relationships of known protein structures. The classification is on hierarchical levels: the first two levels, family and superfamily, describe near and distant evolutionary relationships; the third, fold, describes geometrical relationships. The distinction between evolutionary relationships and those that arise from the physics and chemistry of proteins is a feature that is unique to this database so far.

SCOP2 is a successor of SCOP. Similarly to SCOP, the main focus of SCOP2 is on proteins that are structurally characterized and deposited in the PDB. Proteins are organized according to their structural and evolutionary relationships, but, in contrast to SCOP, instead of a simple tree-like hierarchy these relationships form a complex network of nodes. Each node represents a relationship of a particular type and is exemplified by a region of protein structure and sequence.

SCOPe is a database developed at the Berkeley Lab and UC Berkeley that extends SCOP (version 1). SCOPe classifies many structures released since SCOP 1.75 through a combination of automation and manual curation, and corrects some errors, aiming to have the same accuracy as the fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL database.
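
The tree-like SCOP hierarchy can be illustrated with a short Python sketch that reads the levels out of a SCOP concise classification string (sccs) of the form class.fold.superfamily.family, such as a.1.1.2. The sccs notation is not described in the text above and is brought in here as an assumption for illustration, as are the function names.

```python
from typing import Dict, Optional

LEVELS = ("class", "fold", "superfamily", "family")


def parse_sccs(sccs: str) -> Dict[str, str]:
    """Split an sccs string such as 'a.1.1.2' into its hierarchical levels."""
    parts = sccs.split(".")
    if (
        len(parts) != 4
        or not parts[0].isalpha()
        or not all(p.isdigit() for p in parts[1:])
    ):
        raise ValueError(f"not a valid sccs string: {sccs!r}")
    # Each level is identified by the prefix up to and including that level
    return {level: ".".join(parts[: i + 1]) for i, level in enumerate(LEVELS)}


def deepest_shared_level(sccs_a: str, sccs_b: str) -> Optional[str]:
    """Return the most specific level two domains share, or None."""
    a, b = parse_sccs(sccs_a), parse_sccs(sccs_b)
    shared = None
    for level in LEVELS:
        if a[level] != b[level]:
            break
        shared = level
    return shared


levels = parse_sccs("a.1.1.2")
```

Two domains that share a superfamily but not a family are, in SCOP's terms, distantly related by evolution; domains that share only a fold are related geometrically.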

ASTRAL

The ASTRAL compendium provides databases and tools useful for analyzing protein structures and their sequences. It is partially derived from, and augments, the Structural Classification of Proteins (SCOP) database. Most of the resources it provides depend upon the coordinate files maintained and distributed by the Protein Data Bank.

Protein Structure Tools

CAPRI

The Critical Assessment of Predicted Interactions (CAPRI) assesses the capacity of protein-protein docking methods to predict protein-protein interactions. CAPRI is a community-wide experiment designed to assess those methods that are based on structure. Its targets are unpublished crystal or NMR structures of complexes, communicated on a confidential basis by their authors to the CAPRI management. Participating predictor groups are given the atomic coordinates of two proteins that make biologically relevant interactions.

They model the target complex with the help of the coordinates and other publicly available data (sequence, mutations, etc.), and submit sets of ten models for assessment on the CAPRI website. After the prediction round is completed, the CAPRI assessors compare the submissions to the experimental structure and evaluate the models on criteria that depend on the geometry and biological relevance of the predicted interactions.

Comparative Modeling (Homology Modeling) Servers

Comparative modeling servers are continuously and automatically evaluated by EVA. There is also a structure prediction meta-server for difficult cases, the BioInfoBank Meta Server. For straightforward cases, comparative modeling is automated by SWISS-MODEL.

SWISS-MODEL

SWISS-MODEL is a fully automated protein structure homology-modelling server, accessible via the ExPASy web server, or from the program DeepView (Swiss Pdb-Viewer). SWISS-MODEL consists of three tightly integrated components:

(1) The SWISS-MODEL pipeline – a suite of software tools and databases for automated protein structure modelling.

(2) The SWISS-MODEL Workspace – a web-based graphical user workbench.

(3) The SWISS-MODEL Repository – a continuously updated database of homology models for a set of model organism proteomes of high biomedical interest.

BioInfoBank Meta Server

The BioInfoBank Meta Server offers a gateway to well-benchmarked protein structure and function prediction methods. Structural models collected from the prediction servers are assessed using the powerful 3D-jury consensus approach.
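
The general idea behind a consensus assessment, namely that a model resembling many other independently produced models is more likely to be correct, can be sketched in Python as follows. This is a generic simplification and not the 3D-jury scoring scheme itself; the function names, the model labels and the similarity values are hypothetical, and the pairwise similarity matrix is assumed to have been computed beforehand by some structure comparison method.

```python
from typing import Dict, List, Sequence


def consensus_scores(
    names: Sequence[str], similarity: Sequence[Sequence[float]]
) -> Dict[str, float]:
    """Score each model by its mean similarity to all other models.

    similarity[i][j] is an assumed, precomputed structural similarity
    between model i and model j (higher means more alike).
    """
    n = len(names)
    if len(similarity) != n or any(len(row) != n for row in similarity):
        raise ValueError("similarity must be an n x n matrix")
    scores = {}
    for i, name in enumerate(names):
        others = [similarity[i][j] for j in range(n) if j != i]
        scores[name] = sum(others) / len(others) if others else 0.0
    return scores


def best_model(names: Sequence[str], similarity: Sequence[Sequence[float]]) -> str:
    """Return the model with the highest consensus score."""
    scores = consensus_scores(names, similarity)
    return max(names, key=lambda name: scores[name])


# Made-up similarity values for three models from different servers
model_names: List[str] = ["m1", "m2", "m3"]
sim = [
    [0.0, 80.0, 20.0],
    [80.0, 0.0, 30.0],
    [20.0, 30.0, 0.0],
]
scores = consensus_scores(model_names, sim)
```

Here m2 scores highest because it is similar to both of the other models, whereas m3 is an outlier.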

ConSurf

The ConSurf server (Glaser et al., 2003; Landau et al., 2005; Ashkenazy et al., 2010; Celniker et al., 2013; Ashkenazy et al., 2016) is a bioinformatics tool for estimating the evolutionary conservation of amino/nucleic acid positions in a protein/DNA/RNA molecule based on the phylogenetic relations between homologous sequences. The degree to which an amino (or nucleic) acid position is evolutionarily conserved (i.e., its evolutionary rate) is strongly dependent on its structural and functional importance. Thus, conservation analysis of positions among members from the same family can often reveal the importance of each position for the protein (or nucleic acid)'s structure or function. In ConSurf, the evolutionary rate is estimated based on the evolutionary relatedness between the protein (DNA/RNA) and its homologues and considering the similarity between amino (nucleic) acids as reflected in the substitution matrix (Pupko et al., 2002; Mayrose et al., 2004).
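
Per-position conservation can be illustrated with a much simpler stand-in than the rate estimation ConSurf performs: the Shannon entropy of each alignment column. The Python sketch below deliberately ignores the phylogenetic tree and the substitution matrix, so it is not ConSurf's method; the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter
from typing import List


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def conservation_profile(alignment: List[str]) -> List[float]:
    """Entropy per column; lower values indicate more conserved positions."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column_entropy("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]


# Toy alignment of four made-up sequences
profile = conservation_profile(["ACD", "ACE", "AGF", "A-G"])
```

The first column is invariant and gets entropy 0, while the last column has four different residues and gets the highest value.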

Dali

The Dali server is a network service for comparing protein structures in 3D. One submits the coordinates of a query protein structure and Dali compares them against those in the Protein Data Bank (PDB). In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences. Users can perform four types of structure comparisons:

(1) Heuristic PDB search - compares one query structure against those in the PDB.

(2) Exhaustive PDB25 search - compares one query structure against a representative subset of the Protein Data Bank.

(3) Pairwise structure comparison - compares one query structure against those specified by the user.

(4) All against all structure comparison - returns a structural similarity dendrogram for a set of structures specified by the user.

  1. ^ "Hot Picks". Science. 284 (5416): 871–871. 1999-05-07. doi: 10.1126/science.284.5416.871b. ISSN  0036-8075.
  2. ^ Bourne PE, Helge W, eds. (2003). Structural Bioinformatics. Hoboken, NJ: Wiley-Liss. p. 229. ISBN  978-0-471-20199-1. OCLC  50199108.
  3. ^ Bourne,PE; Murray-Rust, J; Lakey JH (Feb 1999). "Protein-nucleic acid interactions Folding and binding Web alert". Current Opinion in Structural Biology. 9 (1): 9–10. doi: 10.1016/S0959-440X(99)90000-3.
  4. ^ "Morphs". Proteopedia. Proteopedia. Retrieved 2015-10-30.
  5. ^ Borner (ed.). Knowledge Management and Visualization Tools in Support of Discovery (NSF Workshop Report) (PDF) (Report). National Science Foundation Workshop. p. 5.
  6. ^ http://www.ncbi.nlm.nih.gov/m/pubmed/9722650/
  7. ^ http://www.dynameomics.org
  8. ^ http://depts.washington.edu/daglab/
  9. ^ http://jenalib.leibniz-fli.de/
  10. ^ http://ndbserver.rutgers.edu/
  11. ^ http://proteopedia.org/wiki/index.php/Atomic_coordinate_file
  12. ^ http://mammoth.psu.edu/SAP.html
  13. ^ http://jena3d.leibniz-fli.de/
  14. ^ "[…] a database of annotated comparative protein structure models, and associated resources". Nucleic Acids Res. 39 (Database issue): D465-74. doi: 10.1093/nar/gkq1091. PMC  3013688. PMID  21097780.
  15. ^ https://modbase.compbio.ucsf.edu/modweb/
  16. ^ http://oca.weizmann.ac.il/oca-bin/ocamain
  17. ^ "[…] a Web-based database of summaries and analyses of all PDB structures". Trends in Biochemical Sciences. 22 (12): 488–90. Dec 1997. doi: 10.1016/S0968-0004(97)01140-7. PMID  9433130.
  18. ^ Laskowski RA (Jan 2001). "PDBsum: summaries and analyses of PDB structures". Nucleic Acids Research. 29 (1): 221–2. doi: 10.1093/nar/29.1.221. PMC  29784. PMID  11125097.
  19. ^ Laskowski RA, Chistyakov VV, Thornton JM (Jan 2005). "PDBsum more: new summaries and analyses of the known 3D structures of proteins and nucleic acids". Nucleic Acids Research. 33 (Database issue): D266-8. doi: 10.1093/nar/gki001. PMC  539955. PMID  15608193.
  20. ^ Laskowski RA (Jan 2009). "PDBsum new things". Nucleic Acids Research. 37 (Database issue): D355-9. doi: 10.1093/nar/gkn860. PMC  2686501. PMID  18996896.
  21. ^ "PDBsum documentation: About PDBsum". European Molecular Biology Laboratory – The European Bioinformatics Institute. Retrieved 9 September 2014.
  22. ^ http://www.img.bio.uni-goettingen.de/ms-www/internal/manuals/promotif/promotif.html
  23. ^ https://www.ebi.ac.uk/thornton-srv/software/PROCHECK/
  24. ^ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC29784/
  25. ^ http://tmdet.enzim.hu/
  26. ^ http://pdbtm.enzim.hu/
  27. ^ https://www.ncbi.nlm.nih.gov/m/pubmed/23203988/
  28. ^ http://dunbrack2.fccc.edu/ProtCiD/Default.aspxis
  29. ^ Xu, Q.; Dunbrack, R. L. (2010). "The protein common interface database (ProtCID)—a comprehensive database of interactions of homologous proteins in multiple crystal forms". Nucleic Acids Research. 39 (Database issue): D761–70. doi: 10.1093/nar/gkq1059. PMC  3013667. PMID  21036862.
  30. ^ Xu, Qifang; Canutescu, Adrian A.; Wang, Guoli; Shapovalov, Maxim; Obradovic, Zoran; Dunbrack, Roland L. (2008). "Statistical Analysis of Interface Similarity in Crystals of Homologous Proteins". Journal of Molecular Biology. 381 (2): 487–507. doi: 10.1016/j.jmb.2008.06.002. PMC  2573399. PMID  18599072.
  31. ^ Berman, H. M.; Battistuz, T.; Bhat, T. N.; Bluhm, W. F.; Bourne, P. E.; Burkhardt, K.; Feng, Z.; Gilliland, G. L.; Iype, L.; Jain, S.; Fagan, P.; Marvin, J.; Padilla, D.; Ravichandran, V.; Schneider, B.; Thanki, N.; Weissig, H.; Westbrook, J. D.; Zardecki, C. (2002). "The Protein Data Bank". Acta Crystallographica Section D. 58 (Pt 6 No 1): 899–907. doi: 10.1107/S0907444902003451. PMID  12037327.
  32. ^ Punta, M.; Coggill, P. C.; Eberhardt, R. Y.; Mistry, J.; Tate, J.; Boursnell, C.; Pang, N.; Forslund, K.; Ceric, G.; Clements, J.; Heger, A.; Holm, L.; Sonnhammer, E. L. L.; Eddy, S. R.; Bateman, A.; Finn, R. D. (2011). "The Pfam protein families database". Nucleic Acids Research. 40 (Database issue): D290–D301. doi: 10.1093/nar/gkr1065. PMC  3245129. PMID  22127870.
  33. ^ Krissinel, E.; Henrick, K. (2007). "Inference of Macromolecular Assemblies from Crystalline State". Journal of Molecular Biology. 372 (3): 774–797. doi: 10.1016/j.jmb.2007.05.022. PMID  17681537.
  34. ^ https://www.ncbi.nlm.nih.gov/protein/
  35. ^ https://www.ncbi.nlm.nih.gov/genbank/tpa/
  36. ^ https://www.prf.or.jp/index-e.html
  37. ^ Hodis E, Prilusky J, Martz E, Silman I, Moult J, Sussman JL (2008). "Proteopedia - a scientific 'wiki' bridging the rift between three-dimensional structure and function of biomacromolecules". Genome Biol. 9 (8): R121. doi: 10.1186/gb-2008-9-8-r121. PMC  2575511. PMID  18673581.
  38. ^ Martz E (2009). "Proteopedia.Org: a scientific "Wiki" bridging the rift between 3D structure and function of biomacromolecules". Biopolymers. 92 (1): 76–7. doi: 10.1002/bip.21126. PMID  19117028.
  39. ^ Hodis E, Prilusky J, Sussman JL (2010). "Proteopedia: A collaborative, virtual 3D web-resource for protein and biomolecule structure and function". Biochem. Mol. Biol. Educ. 38 (5): 341–2. doi: 10.1002/bmb.20431.
  40. ^ Prilusky, J; Hodis, E.; Canner, D.; Decatur, W. A.; Oberholser, K.; Martz, E.; Berchanski, A.; Harel, M.; Sussman, J. L. (Aug 2011). "Proteopedia: A status report on the collaborative, 3D web-encyclopedia of proteins and other biomolecules". Journal of Structural Biology. 175 (2): 244–252. doi: 10.1016/j.jsb.2011.04.011. PMID  21536137.
  41. ^ "Acetylcholinesterase". Proteopedia.
  42. ^ "Hemoglobin". Proteopedia.
  43. ^ "Photosystem II". Proteopedia.
  44. ^ https://www.Proteopedia.org
  45. ^ https://www.ncbi.nlm.nih.gov/pubmed/15801979
  46. ^ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1160221/
  47. ^ http://www.nature.com/nrg/series/computational/index.html?foxtrotcallback=true
  48. ^ http://dis.embl.de/
  49. ^ http://www.disprot.org/
  50. ^ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC308329/