Statistical Mechanics of Learning

Cambridge University Press, Mar 29, 2001 - Computers - 329 pages
Learning is one of the things that humans do naturally, and it has always been a challenge for us to understand the process. Nowadays this challenge has another dimension as we try to build machines that are able to learn and to undertake tasks such as data mining, image processing and pattern recognition. Artificial neural networks provide a simple framework in which learning from examples can be described and understood. This book presents the contribution made to this subject over the last decade by researchers applying the techniques of statistical mechanics. The authors provide a coherent account of important concepts and techniques that are currently scattered across papers, supplement this with background material in mathematics and physics, and include many examples and exercises, making the book suitable for use in courses, for self-study, or as a handy reference.
 


Contents

1 Getting Started  3
1.2 A simple example  6
1.3 General setup  10
1.4 Problems  15
2 Perceptron Learning Basics  16
2.2 The annealed approximation  20
2.3 The Gardner analysis  24
2.4 Summary  29
2.5 Problems  31
3 A Choice of Learning Rules  35
3.2 The perceptron rule  38
3.3 The pseudo-inverse rule  39
3.4 The adaline rule  41
3.5 Maximal stability  42
3.6 The Bayes rule  44
3.7 Summary  48
4 Augmented Statistical Mechanics Formulation  51
4.2 Gibbs learning at non-zero temperature  54
4.3 General statistical mechanics formulation  58
4.4 Learning rules revisited  61
4.5 The optimal potential  65
4.6 Summary  66
4.7 Problems  67
5 Noisy Teachers  71
5.2 Trying perfect learning  74
5.3 Learning with errors  80
5.4 Refinements  82
5.5 Summary  84
5.6 Problems  85
6 The Storage Problem  87
6.2 The Cover analysis  91
6.3 The Ising perceptron  95
6.4 The distribution of stabilities  100
6.5 Beyond the storage capacity  104
6.6 Problems  106
7 Discontinuous Learning  111
7.2 The Ising perceptron  113
7.3 The reversed wedge perceptron  116
7.4 The dynamics of discontinuous learning  120
7.5 Summary  123
7.6 Problems  124
8 Unsupervised Learning  127
8.2 The deceptions of randomness  131
8.3 Learning a symmetry-breaking direction  135
8.4 Clustering through competitive learning  139
8.5 Clustering by tuning the temperature  144
8.7 Problems  149
9 Online Learning  151
9.2 Specific examples  154
9.3 Optimal online learning  157
9.4 Perceptron with a smooth transfer function  161
9.5 Queries  162
9.6 Unsupervised online learning  167
9.7 The natural gradient  171
9.8 Discussion  172
9.9 Problems  173
10 Making Contact with Statistics  178
10.2 Sauer's lemma  180
10.3 The Vapnik-Chervonenkis theorem  182
10.4 Comparison with statistical mechanics  184
10.5 The Cramér-Rao inequality  188
10.6 Discussion  191
10.7 Problems  192
11 A Bird's Eye View: Multifractals  195
11.2 The multifractal spectrum of the perceptron  197
11.3 The multifractal organization of internal representations  205
11.4 Discussion  209
12 Multilayer Networks  211
12.1 Basic architectures  212
12.2 Bounds  216
12.3 The storage problem  220
12.4 Generalization with a parity tree  224
12.5 Generalization with a committee tree  227
12.6 The fully connected committee machine  230
12.7 Summary  232
12.8 Problems  234
13 Online Learning in Multilayer Networks  239
13.2 The parity tree  245
13.3 Soft committee machine  248
13.4 Backpropagation  253
13.5 Bayesian online learning  255
13.6 Discussion  257
13.7 Problems  258
14 What Else?  261
14.2 Complex optimization  265
14.3 Error-correcting codes  268
14.4 Game theory  272
Appendices  277
A2 The Gardner Analysis  284
A3 Convergence of the Perceptron Rule  291
A4 Stability of the Replica Symmetric Saddle Point  293
A5 One-step Replica Symmetry Breaking  302
A6 The Cavity Approach  306
A7 The VC theorem  312
Bibliography  315
Index  329
Copyright

Popular passages

Page 320 - K. Rose, E. Gurewitz, and G. C. Fox, "Statistical mechanics and phase transitions in clustering," Physical Review Letters, vol. ...
Page 314 - ν(σ) and ν′(σ) are the frequencies resulting in the two subsamples considered after permuting the examples of the whole sample. Note that the composition of each subsample can be modified by the permutation. It turns out that the quantity can be bounded for all the possible outcomes. We will consider two different bounds for F. The first one is valid for p - p...
