CHAPTER 4
PROMOTER RECOGNITION USING NEURAL NETWORK APPROACHES
4.1 INTRODUCTION
Currently, a huge amount of genome data is available owing to fast sequencing methods. Comparably fast methods for annotating genomes are not available, and current annotation technologies are time-consuming. Hence, automated (machine) annotation methods are required to tackle the major problems of promoter recognition and gene recognition.
Promoters occur upstream of a gene and are the regions at which ribonucleic acid (RNA) polymerase binds and initiates transcription. Promoters also act as switches that specify where in the organism, and at what time, transcription of that gene can occur. The location where transcription begins is known as the transcription start site (TSS). A majority of the promoters of genes that transcribe large amounts of messenger RNA (mRNA) have a set of binding sites or regions [1,2]. One of these sites is a TATA sequence, a hexamer, upstream of the TSS. A promoter also contains one or more binding regions further upstream and downstream. The eukaryotic and prokaryotic promoter recognition problems have to be dealt with independently. For example, the promoter structure for Escherichia coli has two binding regions, present at the -10 and -35 positions with respect to the TSS (the position of which is taken as +1). These are referred to as the -35 motif and the -10 motif. The patterns at these binding sites are known to be conserved. In general, patterns ...
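The two-motif structure described for Escherichia coli can be illustrated with a minimal sketch that scans a sequence for a -35 hexamer followed, after a spacer, by a -10 hexamer. The consensus strings TTGACA and TATAAT are the ones commonly quoted for E. coli promoters and are assumed here rather than taken from the text above; the spacer range of 15 to 19 bases and the function names hexamer_score and best_promoter_hit are likewise hypothetical choices made for illustration. This is simple consensus matching, not the neural network approach that is the subject of this chapter.

```python
# Consensus hexamers commonly quoted for E. coli promoters
# (an assumption of this sketch; the text does not list them).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def hexamer_score(window, consensus):
    """Count the positions at which window agrees with consensus."""
    return sum(a == b for a, b in zip(window, consensus))


def best_promoter_hit(seq, spacers=range(15, 20)):
    """Return (score, pos35, pos10) for the best-scoring motif pair.

    pos35 and pos10 are 0-based start indices of the two hexamers.
    spacers lists the allowed gaps (in bases) between them; the default
    range is an assumption, not a value given in the text.
    Returns None if the sequence is too short to hold both motifs.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 5):
        s35 = hexamer_score(seq[i:i + 6], MINUS_35)
        for gap in spacers:
            j = i + 6 + gap          # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                break                # larger gaps would also run off the end
            s10 = hexamer_score(seq[j:j + 6], MINUS_10)
            if best is None or s35 + s10 > best[0]:
                best = (s35 + s10, i, j)
    return best


if __name__ == "__main__":
    # Toy sequence with perfect motifs separated by a 17-base spacer.
    toy = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
    print(best_promoter_hit(toy))
```

A score of 12 means both hexamers match their consensus exactly. Real promoters rarely do, which is one reason that exact matching of this kind is too rigid on its own and learning-based recognisers are of interest.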