## Statistical Mechanics of Learning

Learning is one of the things that humans do naturally, and it has always been a challenge for us to understand the process. Nowadays this challenge has another dimension as we try to build machines that are able to learn and to undertake tasks such as data mining, image processing and pattern recognition. We can formulate a simple framework, artificial neural networks, in which learning from examples may be described and understood. The contribution to this subject made over the last decade by researchers applying the techniques of statistical mechanics is the subject of this book. The authors provide a coherent account of various important concepts and techniques that are currently only found scattered in papers, supplement this with background material in mathematics and physics and include many examples and exercises to make a book that can be used with courses, or for self-teaching, or as a handy reference.

### Contents

Getting Started | 1 |

1.2 A simple example | 4 |

1.3 General setup | 8 |

1.4 Problems | 13 |

Perceptron Learning Basics | 14 |

2.2 The annealed approximation | 18 |

2.3 The Gardner analysis | 22 |

2.4 Summary | 27 |

2.5 Problems | 29 |

A Choice of Learning Rules | 33 |

3.2 The perceptron rule | 36 |

3.3 The pseudoinverse rule | 37 |

3.4 The adaline rule | 39 |

3.5 Maximal stability | 40 |

3.6 The Bayes rule | 42 |

3.7 Summary | 46 |

Augmented Statistical Mechanics Formulation | 49 |

4.2 Gibbs learning at nonzero temperature | 52 |

4.3 General statistical mechanics formulation | 56 |

4.4 Learning rules revisited | 59 |

4.5 The optimal potential | 63 |

4.6 Summary | 64 |

4.7 Problems | 65 |

Noisy Teachers | 69 |

5.2 Trying perfect learning | 72 |

5.3 Learning with errors | 78 |

5.4 Refinements | 80 |

5.5 Summary | 82 |

5.6 Problems | 83 |

The Storage Problem | 85 |

the Cover analysis | 89 |

the Ising perceptron | 93 |

6.4 The distribution of stabilities | 98 |

6.5 Beyond the storage capacity | 102 |

6.6 Problems | 104 |

Discontinuous Learning | 109 |

7.2 The Ising perceptron | 111 |

7.3 The reversed wedge perceptron | 114 |

7.4 The dynamics of discontinuous learning | 118 |

7.5 Summary | 121 |

7.6 Problems | 122 |

Unsupervised Learning | 125 |

8.2 The deceptions of randomness | 129 |

8.3 Learning a symmetry-breaking direction | 133 |

8.4 Clustering through competitive learning | 137 |

8.5 Clustering by tuning the temperature | 142 |

8.7 Problems | 147 |

Online Learning | 149 |

9.2 Specific examples | 152 |

9.3 Optimal online learning | 155 |

9.4 Perceptron with a smooth transfer function | 159 |

9.5 Queries | 160 |

9.6 Unsupervised online learning | 165 |

9.7 The natural gradient | 169 |

9.8 Discussion | 170 |

9.9 Problems | 171 |

Making Contact with Statistics | 176 |

10.2 Sauer's lemma | 178 |

10.3 The Vapnik-Chervonenkis theorem | 180 |

10.4 Comparison with statistical mechanics | 182 |

10.5 The Cramér-Rao inequality | 186 |

10.6 Discussion | 189 |

10.7 Problems | 190 |

A Bird's Eye View: Multifractals | 193 |

11.2 The multifractal spectrum of the perceptron | 195 |

11.3 The multifractal organization of internal representations | 203 |

11.4 Discussion | 207 |

Multilayer Networks | 209 |

12.1 Basic architectures | 210 |

12.2 Bounds | 214 |

12.3 The storage problem | 218 |

12.4 Generalization with a parity tree | 222 |

12.5 Generalization with a committee tree | 225 |

12.6 The fully connected committee machine | 228 |

12.7 Summary | 230 |

12.8 Problems | 232 |

Online Learning in Multilayer Networks | 237 |

13.2 The parity tree | 243 |

13.3 Soft committee machine | 246 |

13.4 Backpropagation | 251 |

13.5 Bayesian online learning | 253 |

13.6 Discussion | 255 |

13.7 Problems | 256 |

What Else? | 259 |

14.2 Complex optimization | 263 |

14.3 Error-correcting codes | 266 |

14.4 Game theory | 270 |

Appendices | 275 |

A2 The Gardner Analysis | 282 |

A3 Convergence of the Perceptron Rule | 289 |

A4 Stability of the Replica Symmetric Saddle Point | 291 |

A5 One-step Replica Symmetry Breaking | 300 |

A6 The Cavity Approach | 304 |

A7 The VC theorem | 310 |

313 | |

327 | |

### Common terms and phrases

algorithm analysis application approach approximation architecture assume asymptotic average becomes behaviour bound calculation cells chapter characterized choice classification connected Consider correct corresponding cost coupling vector defined depends derived described detailed determined direction discussed distribution energy entropy equal equations error examples expression fact finally function Gaussian Gibbs given gives hand Hebb hence hidden units input integral interesting introduced Ising learning learning rules limit machine maximal method minimal networks neural noise Note observed obtained on-line learning optimal output overlap parameters particular perceptron performance possible present probability problem properties quantity random realize replica symmetry representations respect result rule saddle point scenario Show shown similar simple solution stability statistical mechanics storage capacity student symmetry teacher training error training set transition tree turns typical variables version space volume zero

### Popular passages

Page 318 - K. Rose, E. Gurewitz, and G. C. Fox, "Statistical mechanics and phase transitions in clustering," Physical Review Letters, vol.