8

Using the Protein Data Bank

Proteomics is the study of proteins, including their function and structure. One of the main objectives of this field is to characterize the three-dimensional structure of proteins. Among its most widely known computational resources is the Protein Data Bank (PDB), a repository of structural data for large biomolecules. Many other databases focus on protein primary structure instead; these are somewhat similar to the genomic databases that we saw in Chapter 2, Getting to Know NumPy, pandas, Arrow, and Matplotlib.

In this chapter, we will mostly focus on processing data from the PDB. We will look at how to parse PDB files, perform some geometric computations, and visualize molecules. ...
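To give a flavor of what parsing and a geometric computation involve, here is a minimal sketch that uses only the Python standard library. It assumes the legacy fixed-column PDB text format, in which the atom name, residue name, chain, residue number, and x, y, z coordinates of each ATOM or HETATM record sit at fixed character positions. The two records in `SAMPLE_PDB` are made up for illustration and do not come from a real PDB entry, and `parse_atoms` and `distance` are hypothetical helper names. In practice, a dedicated library such as Biopython's `Bio.PDB` module is likely the more robust choice.

```python
import math

# Two made-up ATOM records in the fixed-column PDB format (not a real entry).
SAMPLE_PDB = """\
ATOM      1  N   ALA A   1       0.000   0.000   0.000  1.00  0.00           N
ATOM      2  CA  ALA A   1       3.000   4.000   0.000  1.00  0.00           C
END
"""


def parse_atoms(pdb_text):
    """Return a list of dicts, one per ATOM/HETATM record in pdb_text."""
    atoms = []
    for line in pdb_text.splitlines():
        # Coordinate records start with one of these six-character names.
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        atoms.append({
            "name": line[12:16].strip(),      # columns 13-16: atom name
            "res_name": line[17:20].strip(),  # columns 18-20: residue name
            "chain": line[21],                # column 22: chain identifier
            "res_seq": int(line[22:26]),      # columns 23-26: residue number
            "coord": (
                float(line[30:38]),           # columns 31-38: x (angstroms)
                float(line[38:46]),           # columns 39-46: y
                float(line[46:54]),           # columns 47-54: z
            ),
        })
    return atoms


def distance(atom_a, atom_b):
    """Euclidean distance between two parsed atoms, in angstroms."""
    return math.dist(atom_a["coord"], atom_b["coord"])


atoms = parse_atoms(SAMPLE_PDB)
print(atoms[0]["name"], atoms[1]["name"], distance(atoms[0], atoms[1]))
```

Slicing by column rather than splitting on whitespace matters here because adjacent fields in this format can run together without a separating space.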
