Pattern Classification

The first edition, published in 1973, has become a classic reference in the field. Now with the second edition, readers will find information on key new topics such as neural networks and statistical pattern recognition, the theory of machine learning, and the theory of invariances. Also included are worked examples, comparisons between different methods, extensive graphics, expanded exercises, and computer project topics. An Instructor's Manual presenting detailed solutions to all the problems in the book is available from the Wiley editorial department.
From inside the book
Page 128
Now we turn to the problem of making a sequence of decisions. In problems that
have an inherent temporality, that is, that consist of a process unfolding in time,
we may have states at time t that are influenced directly by a state at t − 1.
Hidden Markov models (HMMs) have found their greatest use in such problems, for
instance speech recognition or gesture recognition. While the notation and
description are unavoidably more complicated than for the simpler models considered
up to this ...
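The first-order dependence described above, where the state at time t is conditioned only on the state at t − 1, can be sketched in a few lines. This is a minimal illustration, not code from the book; the transition matrix A and initial distribution pi below are hypothetical toy values:

```python
def chain_probability(A, pi, states):
    """Probability of a hidden-state sequence under a first-order Markov chain.

    A[i][j] is P(omega(t) = j | omega(t-1) = i); pi[i] is P(omega(1) = i).
    """
    p = pi[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= A[prev][cur]  # each state depends only on its immediate predecessor
    return p

# Hypothetical two-state chain for illustration.
A = [[0.9, 0.1],
     [0.5, 0.5]]
pi = [1.0, 0.0]
print(chain_probability(A, pi, [0, 0, 1]))  # 1.0 * 0.9 * 0.1
```

The key point is that the product contains one factor per time step, each looking back exactly one step; nothing earlier than t − 1 enters the term for time t.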
Page 131
3.10.4 Evaluation

The probability that the model produces a sequence V^T of visible states is

    P(V^T) = Σ_{r=1}^{r_max} P(V^T | ω_r^T) P(ω_r^T),    (132)

where each r indexes a particular sequence ω_r^T = {ω(1), ω(2), ..., ω(T)} of T hidden states. In the general
case of c hidden states, there will be r_max = c^T possible terms in the sum of Eq.
132, corresponding to all possible sequences of length T. Thus, according to Eq.
132, in order to compute the probability that the model generated the particular
sequence ...
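A direct, exponential-cost reading of Eq. 132 can be sketched as follows. The sum visits all r_max = c^T hidden sequences; the transition matrix A, emission matrix B, and prior pi are hypothetical illustrative values, not parameters from the book:

```python
from itertools import product

def evaluate_brute_force(A, B, pi, visible):
    """P(V^T): sum over all c**T hidden-state sequences, as in Eq. 132."""
    c, T = len(pi), len(visible)
    total = 0.0
    for omega in product(range(c), repeat=T):   # r_max = c**T terms
        # P(omega^T): prior of the first state times the transition probabilities
        p_hidden = pi[omega[0]]
        for t in range(1, T):
            p_hidden *= A[omega[t - 1]][omega[t]]
        # P(V^T | omega^T): one emission probability per time step
        p_visible = 1.0
        for t in range(T):
            p_visible *= B[omega[t]][visible[t]]
        total += p_visible * p_hidden
    return total

# Hypothetical two-state, two-symbol model for illustration.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
print(evaluate_brute_force(A, B, pi, [0, 1, 0]))
```

This brute-force sum is exactly the cT-term computation the text warns about; the Forward algorithm obtains the same quantity in O(c^2 T) time by reusing partial sums instead of enumerating sequences.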
Page 137
Moreover, using postprocessing, we can delete repeated states and get just the
sequence, somewhat independent of variations in rate. Thus in postprocessing
we can convert the sequence {ω1, ω1, ω3, ω2, ω2, ω2} to {ω1, ω3, ω2}, which
would be appropriate for speech recognition, where the fundamental phonetic
units are not repeated in natural speech.

3.10.6 Learning

The goal in HMM learning is to determine the model parameters (the transition probabilities a_ij and
b_jk) from an ...
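The repeated-state deletion described above amounts to collapsing runs of identical consecutive states. A minimal sketch using the standard library, with string labels standing in for the book's ω symbols:

```python
from itertools import groupby

def collapse_repeats(decoded):
    """Delete consecutive duplicate states from a decoded state sequence."""
    return [state for state, _run in groupby(decoded)]

# The book's example sequence, with "w1", "w2", "w3" standing in for the omegas.
print(collapse_repeats(["w1", "w1", "w3", "w2", "w2", "w2"]))  # ['w1', 'w3', 'w2']
```

Note that only consecutive repeats are merged; a state that reappears later in the sequence is kept, which is the behavior wanted when the same phonetic unit legitimately occurs twice in an utterance.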
What people are saying
User review: "Excellent."
User review: "It doesn't describe things very well, so it is hard to understand; it covers many topics with little description."
Contents
A  1 
MAXIMUM-LIKELIHOOD AND BAYESIAN  84
4 NONPARAMETRIC TECHNIQUES  161
Copyright  
15 other sections not shown