This chapter is organized into three sections:
- Section I – Quality filtering of data using PRINSEQ
- Section II – Identification of differentially expressed genes – I (using Cufflinks)
- Section III – Identification of differentially expressed genes – II (using RSEM–DE packages – EBSeq, DESeq2, and edgeR)
44.1 SECTION I. QUALITY FILTERING OF DATA USING PRINSEQ
Most sequencing platforms deliver data in FASTQ format (i.e., base calls together with their quality scores). The data files for this chapter are designated control.fastq and infected.fastq; both contain paired-end reads. These data must first be checked for quality and trimmed before further use. One of the most commonly used programs for quality filtering and trimming is prinseq-lite.pl. Prinseq-lite offers several options for trimming and/or filtering the data. Trimming is done first, followed by the filtering commands. Trimming is commonly used to remove adapter sequences present in the raw data, and also to remove the poly-A tail at the end of a read.
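The trim-then-filter order described above can be illustrated with a small Python sketch. This is not PRINSEQ's implementation. The function names, the thresholds (a minimum tail of 5 bases, a Phred cutoff of 20, a minimum length of 20), and the toy read are assumptions chosen only for illustration, and the quality string is assumed to use Phred+33 encoding.

```python
def trim_poly_a_tail(seq, qual, min_tail=5):
    """Remove a trailing run of 'A' bases if it is at least min_tail long."""
    stripped_len = len(seq.rstrip("Aa"))
    if len(seq) - stripped_len >= min_tail:
        return seq[:stripped_len], qual[:stripped_len]
    return seq, qual


def trim_low_quality_3prime(seq, qual, min_q=20, offset=33):
    """Drop bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def passes_length_filter(seq, min_len=20):
    """Filtering step: keep only reads that are still long enough."""
    return len(seq) >= min_len


# Toy read (hypothetical): 24 informative bases followed by an 8-base poly-A tail.
# 'I' encodes Phred 40 and '#' encodes Phred 2 under the assumed Phred+33 offset.
seq = "ACGT" * 6 + "A" * 8
qual = "I" * 20 + "#" * 4 + "I" * 8

# Trimming first ...
seq, qual = trim_poly_a_tail(seq, qual)
seq, qual = trim_low_quality_3prime(seq, qual)
# ... then filtering.
keep = passes_length_filter(seq)
print(seq, len(seq), keep)
```

Trimming before filtering matters because a length or quality filter applied to the untrimmed read would judge it on bases that are about to be removed.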
44.1.2 Quality check analyses using PRINSEQ
PRINSEQ can generate summary statistics for a data set and can produce filtered, reformatted, and trimmed sequence data. It can be used with all types of sequence data. PRINSEQ is available through a web interface and as a standalone program. ...
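To give a sense of what summary statistics for a FASTQ data set look like, the following Python sketch computes a few of them by hand. It does not reproduce PRINSEQ's output. The helper names, the choice of statistics, and the two-read example are assumptions for illustration, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def summarize(records, offset=33):
    """Return read count, length range, mean length, GC percentage, and mean quality."""
    lengths = []
    gc_bases = 0
    quality_sum = 0
    for _, seq, qual in records:
        lengths.append(len(seq))
        gc_bases += sum(1 for base in seq.upper() if base in "GC")
        quality_sum += sum(ord(ch) - offset for ch in qual)
    total_bases = sum(lengths)
    if total_bases == 0:
        return {"reads": len(lengths)}
    return {
        "reads": len(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": total_bases / len(lengths),
        "gc_percent": 100.0 * gc_bases / total_bases,
        "mean_quality": quality_sum / total_bases,
    }


# Hypothetical two-read data set held in memory instead of a file.
example = io.StringIO("@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nGGCC\n+\n5555\n")
stats = summarize(read_fastq(example))
print(stats)
```

For a real file, the `io.StringIO` object would be replaced by an open file handle, for example `open("control.fastq")`.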