## Pattern Recognition and Neural Networks

This 1996 book is a reliable account of the statistical framework for pattern recognition and machine learning. With unparalleled coverage and a wealth of case studies, it gives valuable insight into both the theory and the enormously diverse applications, which can be found in remote sensing, astrophysics, engineering and medicine, for example. So that readers can develop their skills and understanding, many of the real data sets used in the book are available from the author's website: www.stats.ox.ac.uk/~ripley/PRbook/. For the same reason, many examples are included to illustrate real problems in pattern recognition. Unifying principles are highlighted, and the author gives an overview of the state of the subject, making the book valuable to experienced researchers in statistics, machine learning/artificial intelligence and engineering. The clear writing style also makes the book a superb introduction for non-specialists.


### Contents

| Chapter / section | Page |
| --- | --- |
| **Introduction and Examples** | 1 |
| 1.1 How do neural methods differ? | 4 |
| 1.2 The pattern recognition task | 5 |
| 1.3 Overview of the remaining chapters | 9 |
| 1.4 Examples | 10 |
| 1.5 Literature | 15 |
| **Statistical Decision Theory** | 17 |
| 2.1 Bayes rules for known distributions | 18 |
| 2.2 Parametric models | 26 |
| 2.3 Logistic discrimination | 43 |
| 2.4 Predictive classification | 45 |
| 2.5 Alternative estimation procedures | 55 |
| 2.6 How complex a model do we need? | 59 |
| 2.7 Performance assessment | 66 |
| 2.8 Computational learning approaches | 77 |
| **Linear Discriminant Analysis** | 91 |
| 3.1 Classical linear discrimination | 92 |
| 3.2 Linear discriminants via regression | 101 |
| 3.3 Robustness | 105 |
| 3.4 Shrinkage methods | 106 |
| 3.5 Logistic discrimination | 109 |
| 3.6 Linear separation and perceptrons | 116 |
| **Flexible Discriminants** | 121 |
| 4.1 Fitting smooth parametric functions | 122 |
| 4.2 Radial basis functions | 131 |
| 4.3 Regularization | 136 |
| **Feed-forward Neural Networks** | 143 |
| 5.1 Biological motivation | 145 |
| 5.2 Theory | 147 |
| 5.3 Learning algorithms | 148 |
| 5.4 Examples | 160 |
| 5.5 Bayesian perspectives | 163 |
| 5.6 Network complexity | 168 |
| 5.7 Approximation results | 173 |
| **Non-parametric Methods** | 181 |
| 6.2 Nearest neighbour methods | 191 |
| 6.3 Learning vector quantization | 201 |
| 6.4 Mixture representations | 207 |
| **Tree-structured Classifiers** | 213 |
| 7.1 Splitting rules | 216 |
| 7.2 Pruning rules | 221 |
| 7.3 Missing values | 231 |
| 7.4 Earlier approaches | 235 |
| 7.5 Refinements | 237 |
| 7.6 Relationships to neural networks | 240 |
| 7.7 Bayesian trees | 241 |
| **Belief Networks** | 243 |
| 8.1 Graphical models and networks | 246 |
| 8.2 Causal networks | 262 |
| 8.3 Learning the network structure | 275 |
| 8.4 Boltzmann machines | 279 |
| 8.5 Hierarchical mixtures of experts | 283 |
| **Unsupervised Methods** | 287 |
| 9.1 Projection methods | 288 |
| 9.2 Multidimensional scaling | 305 |
| 9.3 Clustering algorithms | 311 |
| 9.4 Self-organizing maps | 322 |
| **Finding Good Pattern Features** | 327 |
| 10.1 Bounds for the Bayes error | 328 |
| 10.2 Normal class distributions | 329 |
| 10.3 Branch-and-bound techniques | 330 |
| 10.4 Feature extraction | 331 |
| **Statistical Sidelines** | 333 |
| A.2 The EM algorithm | 334 |
| A.3 Markov chain Monte Carlo | 337 |
| A.4 Axioms for conditional independence | 339 |
| A.5 Optimization | 342 |
| **Glossary** | 347 |
| **References** | 355 |
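As a flavour of the decision-theory material in the contents (the Bayes rules for known distributions of §2.1), here is a minimal sketch, not taken from the book: a two-class Bayes classifier with known univariate normal class densities, assigning each point to the class maximizing prior times density. All names and parameter values are illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and s.d. sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_classify(x, priors, params):
    """Bayes rule for known class densities: pick the class whose
    prior * density (proportional to the posterior probability) is largest."""
    scores = [p * gaussian_pdf(x, mu, s) for p, (mu, s) in zip(priors, params)]
    return max(range(len(priors)), key=lambda k: scores[k])

# Two classes with equal priors; (mean, s.d.) per class.
priors = [0.5, 0.5]
params = [(0.0, 1.0), (3.0, 1.0)]

print(bayes_classify(0.2, priors, params))  # -> 0 (closer to the first class mean)
print(bayes_classify(2.5, priors, params))  # -> 1
```

With equal priors and equal variances the rule reduces to assigning by nearest class mean, so the decision boundary here sits midway, at x = 1.5.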


### Common terms and phrases

algorithm, applied, approach, approximation, asymptotic, average, Bayes risk, Bayes rule, Bayesian, binary, bound, Breiman, choose, class densities, classifier, clique, clusters, conditional independence, consider, convergence, covariance matrix, cross-validation, Cushing's syndrome, dataset, density estimation, deviance, dimension, dissimilarity, distance, error rate, example, Figure, Gibbs sampler, gives, hidden layer, hidden units, IEEE Transactions, inputs, iterative, Journal, kernel, Kohonen, linear combination, linear discriminant, log-likelihood, logistic, Machine Learning, Mahalanobis distance, marginal, Markov, Markov property, maximize, maximum likelihood, measure, methods, minimize, mixture, moral graph, multivariate, neighbour, Neural Computation, neural networks, node, non-linear, optimal, outliers, parameters, pattern recognition, perceptron, plug-in, posterior probabilities, predictive, principal components, prior, problem, procedure, projection pursuit, Proposition, pruning, quadratic, random variables, regression, sample, Section, shows, smoothing splines, split, Statistical, subset, Suppose, test set, theory, training set, tree, update, values, variance, VC dimension, vertex, vertices, weight decay, WinF, zero