Sebastopol, CA--You can distinguish a vulture from a hawk by its dihedral wing profile. The mockingbird is easily differentiated from similarly sized birds by the flash of white on its wings and its swift, aerodynamic swoop from tree to tree. In the field, quick identification is made possible with the use of field marks: know the field marks and you'll know your bird. Likewise, you can recognize the bioinformatician involved in sequence analysis by the surrounding clutter of well-thumbed hard copies of readme files that help him or her to make sense of the plethora of flat file formats used to process sequence entries and to remember what the specific field codes mean. With the wealth of tools available, often the bioinformatician's greatest challenge is keeping them straight. But this is soon to change. The release of "Sequence Analysis in a Nutshell: A Guide to Common Tools and Databases" by Scott Markel and Darryl Leon (O'Reilly, US $29.95) does away, once and for all, with the stacks of notes and printed reference papers. This book brings together the vital information about the databases, analytical tools, and tables most commonly used in sequence analysis into one handy reference guide.
Gene sequence data is among the most abundant types of biological data available, and there is a rich array of computational methods and tools that can help analyze patterns within that data. "Sequence Analysis in a Nutshell" pulls together the detailed terms, definitions, and command-line options found in the key databases and tools used in sequence analysis. The book is partitioned into three fundamental areas to help bioinformaticians maximize their use of the content. The first section, "Databases," contains examples of flat files from key databases (GenBank, EMBL, DDBJ, Pfam, PROSITE, and SWISS-PROT), the definitions of the codes or fields used in each database, and the sequence feature types/terms and qualifiers for the nucleotide and protein databases.
The second section, "Tools," provides the command-line syntax for popular applications such as Readseq, MEME/MAST, BLAST, ClustalW, and the EMBOSS suite of analytical tools. The third section, "Appendixes," concentrates on information essential to understanding the individual components that make up a biological sequence. The tables in this section include nucleotide and protein codes, genetic codes, and other relevant information.
Written in O'Reilly's enormously popular, straightforward "In a Nutshell" format, this book provides essential information for bioinformaticians in industry and academia, as well as for students. "Sequence Analysis in a Nutshell" is a handy resource and an invaluable reference for anyone who needs to know about the practical aspects and mechanics of sequence analysis.
Chapter 7, BLAST, is available free online
Sequence Analysis in a Nutshell
Scott Markel and Darryl Leon
ISBN 0-596-00494-X, 284 pages, $29.95 (US), $46.95 (CAN), £20.95 (UK)
O'Reilly Media spreads the knowledge of innovators through its books, online services, magazines, and conferences. Since 1978, O'Reilly Media has been a chronicler and catalyst of cutting-edge development, homing in on the technology trends that really matter and spurring their adoption by amplifying "faint signals" from the alpha geeks who are creating the future. An active participant in the technology community, the company has a long history of advocacy, meme-making, and evangelism.