Errata for Beginning Perl for Bioinformatics

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version	Location	Description	Submitted by	Date submitted
Printed	Page 1 1	Can't download the examples and answers, Bro!	Anonymous
Printed	Page 55 Exercise 4.5	It seems to me that there is a misunderstanding of how transcription occurs (DNA to RNA or RNA to DNA). I would appreciate your feedback. Thanks, Hemant # Perl: Exercise 4-5 Reverse Transcribing RNA into DNA # The RNA $RNA = 'ACGGGAGGACGGGAAAAUUACUACGGCAUUAGC'; print " $RNA "; # Transcribe RNA to DNA - Replace 'U' where there is 'T'. # However, transcription occurs A -> T, U -> A, C -> G, G -> C # So the correct answer is as below given the above RNA structure. $DNA = $RNA; $DNA =~ tr/ACGU/TGCA/; print " $DNA "; # The correct DNA seq is TGCCCTCCTGCCCTTTTAATGATGCCGTAATCG # The code below is incorrect $DNA = $RNA; $DNA =~ s/U/T/g; print " $DNA "; # result is ACGGGAGGACGGGAAAATTACTACGGCATTAGC exit;	Anonymous
Printed	Page 85 Exercise 5.6	I believe the previous reader may have been confused by the use of @ARGV in the provided solution, which is not introduced until page 98. Also, it is possible that submitting the DNA strings in lowercase format could have led to the problem, since the program will only work for uppercase sequences.	Anonymous	May 31, 2011
Printed	Page 107 for each loop in the code	When I ran example 6-4.pl after fixing the two bugs described in the text, the program still did not generate the correct output. It seems that the variable $receivingcommittment was never set to 1. It turned out that the variable name was misspelt as "$recieving..." whereas it should be "$receiving...". Further correction of the variable name would fix the problem.	Anonymous
Printed	Page 117 exercise 6.5	No argument is passed when the subroutine is called, so the print statement is always executed, even if the file doesn't exist. It should be: if(file_passes_tests($file)) { print "File $file exists, is a regular file, and is nonzero in size "; }	Anonymous
Printed	Page 132 first sub-routine (second line)	As the list of nucleotides (A/C/G/T) is specifically stated in the sub-routine 'randomnucleotide' (on Page 133), it seems superfluous to also name them in this sub-routine ('mutate') and to pass them to the second sub-routine as a parameter which isn't used.	Anonymous
Printed	Page 143 last paragraph	Hi, When I run the subroutine, I get the following error messages: syntax error at c7_s4.pl line 76, near ") {" Global symbol "$count" requires explicit package name at c7_s4.pl line 78. Global symbol "$length" requires explicit package name at c7_s4.pl line 78. syntax error at c7_s4.pl line 79, near "}" Execution of c7_s4.pl aborted due to compilation errors. ##################################### sub match_percentage { my ($string1,$string2) =@_; #assume the two strings with same length my $length=length($string1); my ($position); my ($count) =0; for ($position=0; $position < $length; ++$position) { if(substr($string1, $position, 1) eq (substr($string2, $position, 1)) {++$count;} } return $count/$length; }	Anonymous
Printed	Page 146 3rd paragraph	The output of example 7-4 contains "matching positions is 0.24%" and the accompanying text says "a quarter of the positions match". This would be true if it said 24% or 0.24. 0.24% is a quarter of a percent, not 25 percent. Something is wrong here.	Anonymous
Other Digital Version	148 exercise 7.5	In the answer of the exercise07.05, the subroutine mutate_codon says: sub mutate_codon { my($codon) = @_; my @bases = qw(A C G T); my $position = int rand 3; my $base = $bases[$position]; my $newbase; do { $newbase = $bases[rand @bases]; } until ($newbase ne $base); substr($codon, $position, 1) = $newbase; return $codon; } which is not correct. If the author ran this exercise several times, he would realize that sometimes the result printed says AAC mutates to AAC. The error is in the line: my $base = $bases[$position]; where the author uses $position to select the corresponding position in the codon, but indexes the array of bases instead. The correct subroutine should be: sub mutate_codon { my($codon) = @_; my @bases = qw(A C G T); my $position = int rand 3; my $base = substr($codon,$position,1); my $newbase; do { $newbase = $bases[rand @bases]; } until ($newbase ne $base); substr($codon, $position, 1) = $newbase; return $codon; }	Juan	Jan 02, 2012
Printed	Page 185 2nd paragraph	it will look for restriction enzymes .... the restriction enzymes appear. -> it will look for restriction sites .... the restriction sites appear.	Anonymous
Printed	Page 191 Example 9.2	In the (errata) correction of this example (changing from a foreach loop over an array which has been read in, to a while loop which reads in the array, so that the range statement will work), use is made of the open_file() subroutine. I didn't remember seeing this subroutine, and it isn't mentioned in the Index (either under its name, or under subroutines). It is on page 218. The location should be mentioned both where it is used, and in the Index.	Anonymous
Printed	Page 198 Exercise 9.6	On line 95 of the original answer: for ( my $i = 1, my $j = shift(@locations) ; @locations ; $i = $j, $j = shift(@locations) ) { push(@digest, substr($dna, $i-1, $j-$i)); } Using this for loop, the last restriction digest is missed, because after the last enzyme site is shifted off, @locations is empty and the loop stops. The correct for loop should look like this: for ( my $i = 1, my $j = shift(@locations), my $k = 0; $k <= scalar(@locations)+2 ; $i = $j, $j = shift(@locations) ) { $k++; push(@digest, substr($dna, $i-1, $j-$i)); }	Anonymous
Printed	Page 203 3	ftp://ncbi.nlm.nih.gov/genbank/gbrel.txt, given as the location of gbrel.txt (the GenBank release notes), is not correct (or at least is not working at the moment). ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt does work.	Anonymous
Printed	Page 211 near bottom of page	The following code in Example 10-2, ($annotation, $dna) = ($record =~ /^(LOCUS.*ORIGIN\s*\n)(.*)\/\/\n/s); generates an error (uninitialized value <GBFILE> chunk 1) on my Mac, using MacPerl.	Anonymous
Printed	Page 219 sub get_annotation_and_dna	The final statement return ($annotation, $dna) needs a ';'	Anonymous
Printed	Page 221 6	Using a hash for annotations is a great idea except in cases where an annotation type occurs more than once in a Genbank record. I have seen many cases of Genbank records with multiple REFERENCE annotations. I was hoping that the author would point this out and have another example showing a hash whose values were arrays of strings.	Anonymous
Printed	Page 221 example 10.5	I have spent extraordinary effort trying to parse the elements of the Features of GenBank files ... a proper answer to Exercise 10.5 would have been wonderfully helpful ... it's disingenuous to fail to provide an answer and to say that "it makes a good class project" when this book should be designed for individuals who have no teacher; and to state that it is "straightforward but challenging" is a contradiction in terms ... in fact, it is exactly what I want to be able to do, and have not yet succeeded with after a great deal of effort # Answer to Exercise 10.5 # # The answer to this exercise is left to the student, as it makes a good class project. It is a straightforward but challenging extension of material already presented in the text; it also can be the basis of interesting and biologically focused projects. # # Good luck with it!	Anonymous
Printed	Page 222 bottom	This code: while ( $annotation =~ /^[A-Z].*\n(^\s.*\n)*/gm) generates a segmentation fault when the code runs on any real GenBank file, such as hs_ref_chr22.gbs or hs_ref_chr22.gbk	Anonymous
Printed	Page 223 1	Example 10-6, Parsing GenBank Annotation, which begins on page 221, produces incorrect results on pages 223 and 224. In particular, the parse_annotation() subroutine does not check to see if the 'field' ($key:$value) it is about to store in the hash table has already been stored. As a result, previous occurrences of a particular field are clobbered and only the last occurrence is recorded. In the example given, with the input taken from page 201, only the second "REFERENCE" field is displayed (page 224). Interestingly, the very next section on parsing the "FEATURES" table warns on page 228 about the possibility of running into this scenario when parsing the FEATURES table's multiple fields - some of which have the same name. The same coding solution should have been applied to the entire GenBank record.	Anonymous
Printed	Page 241 last paragraph	. .. 3c 44 pdb1a4o.ent -> . .. 3c 44 c1 c4 pdb1a4o.ent Also, the same change is needed on pp. 243, 244, 246, and 247.	Anonymous
Printed	Page 288 code at bottom of page	As noted in another "confirmed" error report, there is an error in the code found at the bottom of page 288. However, I believe the solution is still in error. In particular, while the proposed solution (adding parentheses to the regular expressions; e.g. changing /^Query.*\n/ to /^Query(.*)\n/ and /^Sbjct.*\n/ to /^Sbjct(.*)\n/) may correct an error (I have not tested the code, so I do not know if there are other errors), I do not think it will fix the error of the extraneous "ct" being prepended to the "Subject String" lines in the output at the top of page 289. That error, I believe, is caused by another faulty regular expression at the very end of the code; in particular, the line: $subject =~ s/[^acgt]//g;. As you can see, this line will NOT remove c's and t's from the long, concatenated "Sbjct:" line created from the HSP hash table. Hence, the multiple occurrences of "ct" in the output.	Anonymous
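
The behavior described in the last report (page 288) can be checked with a minimal sketch. It is not the book's code: the two-line HSP fragment and the helper names subject_without_capture and subject_with_capture are hypothetical, chosen only for illustration. Under those assumptions, it suggests that the stray "ct" is the c and t of the "Sbjct" label surviving s/[^acgt]//g when whole lines are matched, and that capturing only the text after the label keeps the label out of the result.

```perl
use strict;
use warnings;

# Hypothetical two-line HSP fragment (assumption); real BLAST output
# has more lines and different coordinates.
my $hsp = "Sbjct: 235 acgtacgt 242\nSbjct: 243 ttgacc 248\n";

# Whole-line match, no capture group: in list context each element
# is the entire matched line, including the "Sbjct" label.
sub subject_without_capture {
    my ($text) = @_;
    my $subject = join('', ($text =~ /^Sbjct.*\n/gm));
    $subject =~ s/[^acgt]//g;    # deletes everything except a, c, g, t
    return $subject;
}

# With a capture group: in list context each element is only the
# text that follows the "Sbjct" label.
sub subject_with_capture {
    my ($text) = @_;
    my $subject = join('', ($text =~ /^Sbjct(.*)\n/gm));
    $subject =~ s/[^acgt]//g;
    return $subject;
}

print subject_without_capture($hsp), "\n";   # prints ctacgtacgtctttgacc
print subject_with_capture($hsp), "\n";      # prints acgtacgtttgacc
```

The "Query" label contains none of the letters a, c, g, or t, which would explain why only the subject lines show the extra characters. Whether this matches the exact code printed on page 288 is an assumption that should be checked against the book.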