## Methods for computational gene prediction

Inferring the precise locations and splicing patterns of genes in DNA is a difficult but important task, with broad applications to biomedicine. The mathematical and statistical techniques that have been applied to this problem are surveyed and organized into a logical framework based on the theory of parsing. Both established approaches and methods at the forefront of current research are discussed. Numerous case studies of existing software systems are provided, in addition to detailed examples that work through the actual implementation of effective gene-predictors using hidden Markov models and other machine-learning techniques. Background material on probability theory, discrete mathematics, computer science, and molecular biology is provided, making the book accessible to students and researchers from across the life and computational sciences. This book is ideal for use in a first course in bioinformatics at graduate or advanced undergraduate level, and for anyone wanting to keep pace with this rapidly-advancing field.

### Contents

Introduction

Mathematical preliminaries

Overview of computational gene prediction

11 other sections not shown

### Common terms and phrases

acceptor alignment alternative splicing amino acid annotation argmax cell Chapter classification coding segments comparative gene compute consensus consider content sensors contig corresponding defined denotes described distribution DNA sequence donor dynamic programming edge emission probabilities emitted entropy estimate eukaryotic evaluate example exon feature forward strand function gene finder gene finding gene prediction gene structure genome GENSCAN GHMM GHMM-based given hidden Markov model Implement initial input sequence intergenic interval intron isochore iterations labeled length Markov chain matrix maximize methods mRNA n-gram node noncoding nucleotide optimal ORF graph pair parameters parse path phase PHMM position predecessor prefix sum problem procedure protein putative exon putative signals queue random recursion regions resulting reverse strand sample score signal sensor Sn Sp splice sites statistical stop codons submodel symbol training data training set transition tree typically variable vertex Viterbi Viterbi algorithm weight