This is the Title of the Book, eMatter Edition
Copyright © 2012 O’Reilly & Associates, Inc. All rights reserved.
Chapter 13: NCBI-BLAST Reference
formatdb -i db -o T -V t
formatdb -i db -o -V
formatdb -idb -ot -VT
formatdb -idb -o -V
The following command, however, is illegal because it tries to set -o to a value of V.
formatdb -idb -oV
blastall Parameters
blastall is controlled by several parameters. Many of the parameters have default settings
and don’t need to be explicitly assigned. Consider this simple command:
blastall -p blastp
Behind the scenes, this command is converted to:
blastall -p blastp -d nr -i stdin -e 10 -m 0 -o stdout -F T -G 11 -E 2 -X 15 -v 500
-b 250 -f 11 -g T -a 1 -M BLOSUM62 -W 3 -z 0 -K 0 -Y 0 -T F -U F -y 0.0 -Z 0 -A 40
You can see that many parameters are set without your express knowledge. These
parameters affect the results of your experiment and, as reinforced many times
throughout the book, you should try to understand these parameters and set them to
fit each experiment.
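As a minimal sketch of that advice, the script below spells out the settings that matter for one search instead of relying on the defaults. The file names `query.fa` and `result.blastp` are hypothetical, and the values simply repeat the defaults shown above. The command is printed rather than executed, so it can be stored alongside the results.

```shell
#!/bin/sh
# Hypothetical file names; replace them with your own query and database.
QUERY="query.fa"
DB="nr"

# State the parameters explicitly so the report can be reproduced later.
# Every flag used here appears in the expanded default command above.
CMD="blastall -p blastp -d $DB -i $QUERY -e 10 -F T -M BLOSUM62 -W 3 -v 500 -b 250 -o result.blastp"

# Print the command (dry run); paste it into a shell to run the search.
echo "$CMD"
```

Keeping the full command line with each report makes it easier to tell later which settings produced which result.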
The following reference section explains all the parameters available for blastall and
lists the default values that are used if not explicitly set. The table was compiled
according to the default values for the five basic programs. Although megablast can
be run from within blastall (-n T), you should use the standalone program. The
parameters for it are presented later in the chapter.
-a [integer]
Default: 1
Programs: All
Sets the number of processors to use. If you have multiple queries, you will get better
throughput by executing multiple BLAST searches at the same time than by raising -a for a
single search. For insensitive searches such as default BLASTN, setting -a to a higher
value may not appreciably improve speed if disk I/O is the bottleneck.
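One way to run several searches at once is sketched below, assuming each query sits in its own file. The `run_all` function, the `RUNNER` variable, the query file names, and the choice of blastn against nt are all illustrative assumptions. By default the script only prints the commands it would launch.

```shell
#!/bin/sh
# Dry run by default: RUNNER=echo only prints each command.
# Set RUNNER="" in the environment to really launch blastall.
RUNNER=${RUNNER-echo}

run_all() {
    for query in "$@"; do
        # One single-processor search per query file, started in the background.
        $RUNNER blastall -p blastn -d nt -i "$query" -a 1 -o "$query.out" &
    done
    wait    # block until every background search has finished
}

run_all q1.fa q2.fa q3.fa
```

Because the jobs run concurrently, the order of the printed lines (or of job completion) is not guaranteed. If disk I/O is the bottleneck, running many searches against the same database at once may help no more than a higher -a does.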
-A [integer]
Default: blastn 0, others 40
Programs: All
Sets the multiple-hit window size. When BLAST is set to two-hit mode, this option requires
two word hits on the same diagonal to be within [integer] letters of each other in order to
extend from either one. The larger the [integer], the more sensitive BLAST will be. Setting
[integer] to 0 selects the default behavior: a window of 40 for every program except blastn,
whose default is single-hit mode. To specify one-hit behavior, set -P 1.
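The two-hit rule can be illustrated with a small check. This is a simplified sketch, not BLAST's actual implementation: the `two_hit` function is hypothetical, a diagonal is taken to be the query position minus the subject position, "within" is read as "at most", and the handling of overlapping words is ignored.

```shell
#!/bin/sh
# two_hit Q1 S1 Q2 S2 A
# Q and S are the query and subject start positions of two word hits;
# A is the window size. Succeeds when both hits lie on the same diagonal
# and are at most A letters apart.
two_hit() {
    diag1=$(( $1 - $2 ))
    diag2=$(( $3 - $4 ))
    dist=$(( $3 - $1 ))
    [ "$dist" -lt 0 ] && dist=$(( -dist ))
    [ "$diag1" -eq "$diag2" ] && [ "$dist" -gt 0 ] && [ "$dist" -le "$5" ]
}

# Same diagonal, 25 letters apart, window 40: prints "extend"
if two_hit 10 110 35 135 40; then echo "extend"; else echo "skip"; fi
# Same diagonal, 60 letters apart, window 40: prints "skip"
if two_hit 10 110 70 170 40; then echo "extend"; else echo "skip"; fi
# Different diagonals: prints "skip"
if two_hit 10 110 35 140 40; then echo "extend"; else echo "skip"; fi
```

Raising the window to 60 would let the second pair trigger an extension, which is why a larger value makes the search more sensitive.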
-b [integer]
Default: 250
Programs: All
Truncates the report to [integer] number of alignments. There is no warning when you
exceed this limit, so it’s generally a good idea to set [integer] very high unless you’re
interested only in the top hits.
-B [integer]
Default: Optional
Programs: blastn, tblastn
Sets the number of queries to concatenate in a single search. Concatenating queries
accelerates the search because the database is scanned just one time. This is the principle
underlying megablast, but the implementation is different in blastall.
This option is new in Version 2.2.6 and still experimental. The specified [integer] must be
the number of sequences in the query file. If it’s less, only the first set of [integer]
sequences is used. Also, the output is very different from what you would expect. All the
query names are listed, and then all the one-line summaries are given, followed by the
alignments, and finally, one footer is produced for the whole report. Given this format,
it’s very difficult to discern which alignments belong to which query. This option should
not be used in its current implementation.
-d [database]
Default: nr
Programs: All
Identifies the database to search. [database] must already be formatted by formatdb.
BLAST looks for [database] in the following order: the local directory, the BLASTDB
environment variable (Unix only), and finally, the location specified in the .ncbirc file.
You can merge multiple databases into a single virtual database by putting the individual
databases in quotes. For example, to merge the nt and est databases, use: -d "nt est". You
can’t mix nucleotide and amino acid databases. The statistics reported are based on the
sizes of the combined databases. Virtual databases may exceed file size limits imposed by
the operating system.
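The first two steps of that lookup order can be approximated in a short script, for instance to check that each member of a virtual database is present before a search starts. This is a rough sketch under stated assumptions: `find_db` is a hypothetical helper, the .ncbirc step is left out, and a database counts as present if a .nin or .pin index file or a .nal or .pal alias file exists for it. The file name `query.fa` is also hypothetical.

```shell
#!/bin/sh
# find_db NAME: print the directory where a formatted database NAME is found,
# checking the current directory first and then $BLASTDB.
# Simplification: the .ncbirc step is omitted, and only .nin/.pin index files
# and .nal/.pal alias files are tested for.
find_db() {
    for dir in . "${BLASTDB:-}"; do
        [ -n "$dir" ] || continue
        for ext in nin pin nal pal; do
            if [ -f "$dir/$1.$ext" ]; then
                echo "$dir"
                return 0
            fi
        done
    done
    return 1
}

# Check each member of the virtual database before searching it.
for db in nt est; do
    find_db "$db" >/dev/null || echo "warning: $db not found locally or in BLASTDB" >&2
done

# Print the search command (dry run); the quotes keep "nt est" as one argument.
echo 'blastall -p blastn -d "nt est" -i query.fa'
```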
-D [1..23]
Default: 1
Programs: tblastn, tblastx
The genetic code to use for translation of the database nucleotide sequence. See
http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy for updates.
Options
1 Standard Nuclear Genetic Code
2 Vertebrate Mitochondrial
3 Yeast Mitochondrial
4 Mold, Protozoan, and Coelenterate Mitochondrial
