CHAPTER 9
Visualizing the Genome
In the previous chapter, we performed quality control steps on the FASTQ file delivered by the sequencing facility, and we aligned all the sequences of that file to the human genome reference. We are only halfway through our analysis. We still have to, first, “call the variants” (that is, formally identify all the differences between our genome and the genome reference) and, second, annotate these variants to find out their impact: Do they change the protein encoded by a gene? Are they common or rare? Benign or potentially pathogenic? And so forth.
Before moving to these next steps, it is helpful to visualize the result of the alignment of our genome to the human genome reference. It will help us to better understand the process of genome analysis, as well as how our genome works. We will use a genome visualizer for this step.
Introducing Genome Visualizers
What is a genome visualizer? It is a graphical tool that allows you to visualize genomic data. The sequence of a genome is usually displayed as a horizontal line. Genomic data associated with that sequence is aligned to it and organized as “tracks” above or below the genome sequence. For example, tracks can include RNA or protein information, DNA modifications, transcription factor binding sites, the degree of complexity of the sequence, the degree of conservation of the sequence compared with other species, or alignments of sequenced reads to that genome sequence.
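To make the idea of tracks concrete, here is a minimal Python sketch that models each track as a list of intervals on a shared genome coordinate and draws it as a row of text. The names `Feature`, `Track`, and `render`, as well as the coordinates and labels, are hypothetical and chosen only for illustration; this is not how any particular genome visualizer is implemented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Feature:
    """One item on a track (hypothetical model): 1-based, inclusive coordinates."""
    start: int
    end: int
    label: str


@dataclass
class Track:
    """A named collection of features aligned to the same genome sequence."""
    name: str
    features: list[Feature] = field(default_factory=list)

    def in_region(self, start: int, end: int) -> list[Feature]:
        # Keep every feature that overlaps the requested window.
        return [f for f in self.features if f.start <= end and f.end >= start]


def render(track: Track, region_start: int, region_end: int) -> str:
    """Draw one track as a text row: '=' where a feature covers a position, '.' elsewhere."""
    row = ["."] * (region_end - region_start + 1)
    for f in track.in_region(region_start, region_end):
        lo = max(f.start, region_start) - region_start
        hi = min(f.end, region_end) - region_start
        for i in range(lo, hi + 1):
            row[i] = "="
    return "".join(row)


# Invented example data: one gene annotation track and one read alignment track.
tracks = [
    Track("genes", [Feature(5, 14, "geneA")]),
    Track("reads", [Feature(1, 8, "read1"), Feature(12, 20, "read2")]),
]

for t in tracks:
    print(f"{t.name:>6} {render(t, 1, 20)}")
```

Each printed row shares the same horizontal axis, so a gap in the read track lines up with the positions it corresponds to in the gene track. A graphical visualizer applies the same principle with far richer data and rendering.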
Some genome ...