## Statistical Mechanics of Learning

The effort to build machines that are able to learn and undertake tasks such as data mining, image processing and pattern recognition has led to the development of artificial neural networks in which learning from examples may be described and understood. The contribution to this subject made over the past decade by researchers applying the techniques of statistical mechanics is the subject of this book. The authors provide a coherent account of various important concepts and techniques that are currently only found scattered in papers, supplement this with background material in mathematics and physics, and include many examples and exercises.

### Contents

| Section | Page |
|---|---|
| Getting Started | 1 |
| 1.2 A simple example | 4 |
| 1.3 General setup | 8 |
| 1.4 Problems | 13 |
| Perceptron Learning Basics | 14 |
| 2.2 The annealed approximation | 18 |
| 2.3 The Gardner analysis | 22 |
| 2.4 Summary | 27 |
| 2.5 Problems | 29 |
| A Choice of Learning Rules | 33 |
| 3.2 The perceptron rule | 36 |
| 3.3 The pseudoinverse rule | 37 |
| 3.4 The adaline rule | 39 |
| 3.5 Maximal stability | 40 |
| 3.6 The Bayes rule | 42 |
| 3.7 Summary | 46 |
| Augmented Statistical Mechanics Formulation | 49 |
| 4.2 Gibbs learning at nonzero temperature | 52 |
| 4.3 General statistical mechanics formulation | 56 |
| 4.4 Learning rules revisited | 59 |
| 4.5 The optimal potential | 63 |
| 4.6 Summary | 64 |
| 4.7 Problems | 65 |
| Noisy Teachers | 69 |
| 5.2 Trying perfect learning | 72 |
| 5.3 Learning with errors | 78 |
| 5.4 Refinements | 80 |
| 5.5 Summary | 82 |
| 5.6 Problems | 83 |
| The Storage Problem | 85 |
| the Cover analysis | 89 |
| the Ising perceptron | 93 |
| 6.4 The distribution of stabilities | 98 |
| 6.5 Beyond the storage capacity | 102 |
| 6.6 Problems | 104 |
| Discontinuous Learning | 109 |
| 7.2 The Ising perceptron | 111 |
| 7.3 The reversed wedge perceptron | 114 |
| 7.4 The dynamics of discontinuous learning | 118 |
| 7.5 Summary | 121 |
| 7.6 Problems | 122 |
| Unsupervised Learning | 125 |
| 8.2 The deceptions of randomness | 129 |
| 8.3 Learning a symmetry-breaking direction | 133 |
| 8.4 Clustering through competitive learning | 137 |
| 8.5 Clustering by tuning the temperature | 142 |
| 8.7 Problems | 147 |
| Online Learning | 149 |
| 9.2 Specific examples | 152 |
| 9.3 Optimal online learning | 155 |
| 9.4 Perceptron with a smooth transfer function | 159 |
| 9.5 Queries | 160 |
| 9.6 Unsupervised online learning | 165 |
| 9.7 The natural gradient | 169 |
| 9.8 Discussion | 170 |
| 9.9 Problems | 171 |
| Making Contact with Statistics | 176 |
| 10.2 Sauer's lemma | 178 |
| 10.3 The Vapnik-Chervonenkis theorem | 180 |
| 10.4 Comparison with statistical mechanics | 182 |
| 10.5 The Cramér-Rao inequality | 186 |
| 10.6 Discussion | 189 |
| 10.7 Problems | 190 |
| A Bird's Eye View: Multifractals | 193 |
| 11.2 The multifractal spectrum of the perceptron | 195 |
| 11.3 The multifractal organization of internal representations | 203 |
| 11.4 Discussion | 207 |
| Multilayer Networks | 209 |
| 12.1 Basic architectures | 210 |
| 12.2 Bounds | 214 |
| 12.3 The storage problem | 218 |
| 12.4 Generalization with a parity tree | 222 |
| 12.5 Generalization with a committee tree | 225 |
| 12.6 The fully connected committee machine | 228 |
| 12.7 Summary | 230 |
| 12.8 Problems | 232 |
| Online Learning in Multilayer Networks | 237 |
| 13.2 The parity tree | 243 |
| 13.3 Soft committee machine | 246 |
| 13.4 Backpropagation | 251 |
| 13.5 Bayesian online learning | 253 |
| 13.6 Discussion | 255 |
| 13.7 Problems | 256 |
| What Else? | 259 |
| 14.2 Complex optimization | 263 |
| 14.3 Error-correcting codes | 266 |
| 14.4 Game theory | 270 |
| Appendices | 275 |
| A2 The Gardner Analysis | 282 |
| A3 Convergence of the Perceptron Rule | 289 |
| A4 Stability of the Replica Symmetric Saddle Point | 291 |
| A5 One-step Replica Symmetry Breaking | 300 |
| A6 The Cavity Approach | 304 |
| A7 The VC theorem | 310 |

### Common terms and phrases

adatron algorithm analysis annealed annealed approximation ansatz architecture asymptotic behaviour average Bayes Boolean function bound calculation cells chapter characterized classification cluster committee machine committee tree Consider convergence corresponding cost function coupling vector exponentially Fisher information free energy Gaussian Gibbs learning given gives rise Hebb rule hence hidden units hyperplane indicator function input integral internal representations introduced Ising perceptron learning from examples learning rules matrix maximal minimal multifractal spectrum multilayer networks neural networks neurons obtained off-line on-line learning optimal order parameters output noise overlap parity tree perceptron learning perceptron rule performance probability distribution quenched entropy random variables realize replica symmetry breaking replica trick result reversed wedge perceptron saddle point equations self-averaging Show simple stability statistical mechanics storage capacity storage problem student vector symmetry breaking thermodynamic limit training error training set typical unsupervised unsupervised learning VC dimension version space W-sphere zero

### Popular passages

Page 318 - K. Rose, E. Gurewitz, and G. C. Fox, "Statistical mechanics and phase transitions in clustering," Physical Review Letters, vol.