CHAPTER 2 Retrieval of Protein Sequence from UniProtKB
CS Mukhopadhyay and RK Choudhary
School of Animal Biotechnology, GADVASU, Ludhiana
2.1 INTRODUCTION
The Universal Protein Resource (UniProt; www.uniprot.org/) is a database of protein sequences and functional information, created by merging the Protein Information Resource‐Protein Sequence Database (PIR‐PSD), Swiss‐Prot, and TrEMBL databases. Its knowledgebase, UniProtKB, has two sections: Swiss‐Prot, which holds fully annotated, manually curated records, and TrEMBL, which holds computationally analyzed records.
2.1.1 Features of UniProtKB/Swiss‐Prot
- Non‐redundancy of records.
- High level of integration of data deposited in different related databases (NCBI‐GenBank, EMBL, DDBJ for translated coding sequences).
- High level of manual curation.
- Contained more than 0.25 million entries at the time of writing.
2.1.2 Features of UniProtKB/TrEMBL
- Translations of the nucleotide coding sequences (CDS) deposited in EMBL/NCBI‐GenBank/DDBJ.
- Automatic annotation.
- Contained more than 3.3 million entries at the time of writing.
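The two sections can also be told apart in downloaded data: UniProtKB FASTA headers begin with `sp|` for Swiss‐Prot entries and `tr|` for TrEMBL entries. The following is a minimal sketch of that check; the function name `uniprot_section` and the example header are hypothetical and serve only as illustration.

```python
def uniprot_section(fasta_header: str) -> str:
    """Return the UniProtKB section implied by a FASTA header line.

    UniProtKB headers have the form ">db|ACCESSION|ENTRY_NAME description",
    where db is "sp" (Swiss-Prot) or "tr" (TrEMBL).
    """
    # Drop the leading ">" if present, then take the text before the first "|".
    prefix = fasta_header.strip().lstrip(">").split("|", 1)[0]
    if prefix == "sp":
        return "Swiss-Prot"
    if prefix == "tr":
        return "TrEMBL"
    raise ValueError(f"Not a UniProtKB FASTA header: {fasta_header!r}")


# Hypothetical header, made up for illustration (not a real entry).
example_header = ">sp|P00000|EXAMPLE_BOVIN Example protein OS=Bos taurus"
section = uniprot_section(example_header)
```

Here `section` holds the string "Swiss-Prot", marking the record as manually curated; a header starting with `tr|` would instead mark an automatically annotated record.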
2.2 OBJECTIVE
To download the amino acid sequence of a protein (say, the taurine sex‐determining region Y‐encoded (SRY) peptide).
2.3 PROCEDURE
- Open the Expert Protein Analysis System (ExPASy) homepage: http://www.expasy.org/
- Go to the drop‐down menu “Query all databases” in the upper center of the page and click on “Proteomics” (to obtain information from all relevant databases, such as PROSITE, STRING, ENZYME, UniProtKB, etc.), or else select UniProtKB ...
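As a programmatic alternative to the browser route, the same sequence can be requested from the UniProt REST service. The sketch below is not part of the chapter's procedure. It assumes the current public search endpoint at rest.uniprot.org, the query fields `gene_exact`, `organism_id`, and `reviewed`, and NCBI taxonomy ID 9913 for Bos taurus. The function names are hypothetical, and the endpoint may change over time.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the UniProtKB search endpoint.
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(gene: str, taxon_id: int, reviewed_only: bool = False) -> str:
    """Build a UniProtKB search URL that returns matching entries as FASTA."""
    query = f"gene_exact:{gene} AND organism_id:{taxon_id}"
    if reviewed_only:
        # Restrict the search to Swiss-Prot (manually reviewed) entries.
        query += " AND reviewed:true"
    return f"{UNIPROT_SEARCH}?{urlencode({'query': query, 'format': 'fasta'})}"


def parse_fasta(text: str) -> dict[str, str]:
    """Map each FASTA header (without '>') to its concatenated sequence."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines are wrapped; join them into one string.
            records[header] += line
    return records


def fetch_sequences(gene: str, taxon_id: int, reviewed_only: bool = False) -> dict[str, str]:
    """Download and parse the FASTA records for a gene in one organism (needs network)."""
    url = build_search_url(gene, taxon_id, reviewed_only)
    with urlopen(url, timeout=30) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Calling `fetch_sequences("SRY", 9913)` should return the bovine SRY records as a dictionary keyed by FASTA header, provided the service is reachable and the assumed query fields are still supported. URL construction and parsing are kept separate from the download so that they can be checked without network access.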