
This is the Title of the Book, eMatter Edition
Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
Chapter 8: 20 Tips to Improve Your BLAST Searches
• What are the values for the seeding parameters W, T, and two-hit distance? If
the seeding parameters are too stringent, divergent alignments may not be
seeded. In NCBI-BLAST, W is unfortunately not displayed in the footer. The
values for T and the two-hit distance are given as T: and A:, respectively.
• What is the scoring scheme expecting to find (i.e., target frequency)? If the
scoring matrix expects nearly identical sequences, highly divergent sequences
may be missed.
• What is the alignment threshold? If the alignment threshold is too high,
low-scoring alignments will be thrown away. The gapped and ungapped values are
given after S1: and S2: in NCBI-BLAST. In WU-BLAST, they are on the rows
beneath S2.
• What are B and V set to? If they are set too low, the number of one-line
summaries and database hits may be truncated.
• What are the score and expected length of a significant alignment? Use the
Karlin-Altschul equation to solve for the normalized score and then divide by
H to calculate the length.
• Was complexity filtering employed, and if so, was it hard or soft? Complexity
filtering is generally a good idea, but may prevent some sequences from
generating significant alignments. NCBI-BLAST doesn’t currently report which ...
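
The checklist item on the score and expected length of a significant alignment can be sketched in a few lines of Python. This is a minimal illustration, not a BLAST feature. It assumes the Karlin-Altschul equation in the form E = K·m·n·e^(−λS), takes the normalized score to be λS in nats, and assumes H is the relative entropy in nats per aligned position. The function name and the example numbers are hypothetical; real values of K, λ, and H should be read from the footer of your own report.

```python
import math

def significant_alignment(e_value, m, n, K, lam, H):
    """Estimate the score and length needed to reach a given E-value.

    Assumes E = K * m * n * exp(-lam * S), where m and n are the
    (effective) query and database lengths.
    Returns (raw_score, normalized_score_in_nats, expected_length).
    """
    # Solve the Karlin-Altschul equation for the normalized score lam * S.
    normalized = math.log(K * m * n / e_value)
    # Convert back to a raw score under the chosen scoring scheme.
    raw = normalized / lam
    # H is the information per aligned position, so dividing gives a length.
    length = normalized / H
    return raw, normalized, length

# Illustrative inputs only: a 300-letter query against 1,000,000 letters,
# with parameter values of the kind reported for an ungapped protein search.
raw, normalized, length = significant_alignment(
    e_value=1e-3, m=300, n=1_000_000, K=0.134, lam=0.318, H=0.401
)
print(f"raw score ~ {raw:.1f}, normalized ~ {normalized:.1f} nats, "
      f"expected length ~ {length:.0f} positions")
```

If the expected length is longer than the region of similarity you are hoping to detect, the search parameters or scoring scheme are probably not suited to the task.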