BLAST

"O'Reilly Media, Inc.", Jul 29, 2003 - Computers - 339 pages
Sequence similarity is a powerful tool for discovering biological function. Just as the ancient Greeks used comparative anatomy to understand the human body and linguists used the Rosetta Stone to decipher Egyptian hieroglyphs, today we can use comparative sequence analysis to understand genomes. BLAST (Basic Local Alignment Search Tool) is a sophisticated software package for rapid searching of nucleotide and protein databases, and one of the most important software packages used in sequence analysis and bioinformatics. Most users of BLAST, however, seldom move beyond the program's default parameters and never take advantage of its full power.

BLAST is the only book completely devoted to this popular suite of tools. It offers biologists, computational biology students, and bioinformatics professionals a clear understanding of BLAST as well as the science it supports. This book shows you how to move beyond the default parameters, get specific answers using BLAST, and interpret your results. The book also contains tutorial and reference sections covering NCBI-BLAST and WU-BLAST, background material to help you understand the statistics behind BLAST, Perl scripts to help you prepare your data and analyze your results, and a wealth of tips and tricks for configuring BLAST to meet your own research needs. Some of the topics covered include:
  • BLAST basics and the NCBI web interface
  • How to select appropriate search parameters
  • BLAST programs: BLASTN, BLASTP, BLASTX, TBLASTN, TBLASTX, PHI-BLAST, and PSI-BLAST
  • Detailed BLAST references, including NCBI-BLAST and WU-BLAST
  • Understanding biological sequences
  • Sequence similarity, homology, scoring matrices, scores, and evolution
  • Sequence alignment
  • Calculating BLAST statistics
  • Industrial-strength BLAST, including developing applications with Perl and BLAST
BLAST is the only comprehensive reference with detailed, accurate information on optimizing BLAST searches for high-throughput sequence analysis. This is a book that any biologist should own.
  

What people are saying

User Review

This is a one-stop shop for all your BLAST queries. From the statistics behind BLAST to the different flavors of BLAST, this book is an excellent guide to BLAST. This is a reference book every bioinformatician should read. There is no better guide to BLAST available elsewhere, not even on the World Wide Web!

Contents

Hello BLAST
Using NCBI-BLAST
Alternate Output Formats
Alternate Alignment Views
The Next Step
Further Reading
Theory
Biological Sequences
Evolution
Genomes and Genes
Biological Sequences and Similarity
Further Reading
Sequence Alignment
Smith-Waterman
Dynamic Programming
Variations
Final Thoughts
Sequence Similarity
Amino Acid Similarity
Scoring Matrices
Target Frequencies, lambda, and H
Sequence Similarity
Karlin-Altschul Statistics
Sum Statistics and Sum Scores
Further Reading
Practice
BLAST
The BLAST Algorithm
Further Reading
Anatomy of a BLAST Report
A BLAST Statistics Tutorial
Using Statistics to Understand BLAST Results
20 Tips to Improve Your BLAST Searches
8.3 Perform Controls, Especially in the Twilight Zone
8.4 View BLAST Reports Graphically
8.5 Use the Karlin-Altschul Equation to Design Experiments
8.7 Know When to Use Complexity Filters
8.8 Mask Repeats in Genomic DNA
8.10 Be Skeptical of Hypothetical Proteins
8.12 Use Caution When Searching Raw Sequencing Reads
8.15 Look for Gaps in Coverage as a Sign of Missed Exons
8.17 Perform Pilot Experiments
BLAST Protocols
BLASTN Protocols
BLASTP Protocols
BLASTX Protocols
TBLASTN Protocols
TBLASTX Protocols
Industrial-Strength BLAST
Installation and Command-Line Tutorial
WU-BLAST Installation
Command-Line Tutorial
Editing Scoring Matrices
BLAST Databases
BLAST Databases
Sequence Databases
Sequence Database Management Strategies
Hardware and Software Optimizations
CPUs and Computer Architecture
Compute Clusters
Distributed Resource Management
Software Tricks
Optimized NCBI-BLAST
BLAST Reference
NCBI-BLAST Reference
blastall Parameters
formatdb Parameters
fastacmd Parameters
megablast Parameters
bl2seq Parameters
blastpgp Parameters (PSI-BLAST and PHI-BLAST)
blastclust Parameters
WU-BLAST Reference
Usage Statements
WU-BLAST Parameters
xdformat Parameters
xdget Parameters
Appendixes
NCBI Display Formats
Nucleotide Scoring Schemes
NCBI-BLAST Scoring Schemes
blast-imager.pl
blast2table.pl
Glossary
Index

About the author (2003)

Ian Korf received his B.A. from Cornell University and his Ph.D. from Indiana University. His formal training is in molecular biology, but he has had a fondness for computer programming since his early teens. His post-doctoral research at Washington University in St. Louis and at The Wellcome Trust Sanger Institute in the U.K. has focused on genomic sequence analysis, with an emphasis on comparative genomics and gene prediction. His goal in life is to follow genomes, wherever they happen to take him.

Mark Yandell received his Ph.D. in Molecular, Cellular and Developmental Biology from the University of Colorado, Boulder. After graduation, he joined the Genome Sequencing Center at Washington University, where he pursued post-doctoral studies in computational biology, genome annotation, and SNP discovery. In 1999 he joined Celera Genomics, where he wrote much of the software used by Celera to annotate and analyze the Drosophila, human, mouse, and mosquito genomes. He recently joined the Berkeley Drosophila Genome Project.

Joseph Bedell received his B.S. in Genetics from the University of Georgia in 1991, then worked on mosquito genetics at the Centers for Disease Control and Prevention in Atlanta. He went on to complete a Ph.D. in human genetics at the University of California, Irvine, in 1999. Joseph, like his co-authors, completed a post-doc in mammalian gene annotation with Warren Gish, one of the original developers of BLAST. He is currently the Director of Bioinformatics for Orion Genomics in St. Louis, where he spends his days (and nights) using BLAST to answer important biological and phylogenetic questions in plants.