Statistical and Inductive Inference by Minimum Message Length

Preface

My thanks are due to the many people who have assisted in the work reported here and in the preparation of this book. The work is incomplete and this account of it rougher than it might be. Such virtues as it has owe much to others; the faults are all mine.

My work leading to this book began when David Boulton and I attempted to develop a method for intrinsic classification. Given data on a sample from some population, we aimed to discover whether the population should be considered to be a mixture of different types, classes or species of thing, and, if so, how many classes were present, what each class looked like, and which things in the sample belonged to which class. I saw the problem as one of Bayesian inference, but with prior probability densities replaced by discrete probabilities reflecting the precision to which the data would allow parameters to be estimated. Boulton, however, proposed that a classification of the sample was a way of briefly encoding the data: once each class was described and each thing assigned to a class, the data for a thing would be partially implied by the characteristics of its class, and hence require little further description. After some weeks' arguing our cases, we worked out the mathematics for each approach, and soon discovered they gave essentially the same results. Without Boulton's insight, we might never have made the connection between inference and brief encoding, which is the heart of this work.
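Boulton's encoding argument can be made concrete with a toy computation: compare the length of a one-part message (state one distribution, then encode all the data under it) against a two-part message (state two class distributions, give each datum a class label, then encode each datum under its own class). The sketch below is only illustrative; the flat 8-bit cost per parameter, the Gaussian classes, and the split at zero are assumptions for the example, not Wallace's SMML construction.

    # Minimal sketch: classification as brief encoding, under the
    # assumptions stated above (toy data, flat 8-bit parameter cost).
    import math
    import random

    random.seed(0)

    def nats_to_bits(nats):
        return nats / math.log(2)

    def gaussian_nll(xs, mu, sigma):
        """Negative log-likelihood (nats) of xs under Normal(mu, sigma)."""
        return sum(0.5 * math.log(2 * math.pi * sigma ** 2)
                   + (x - mu) ** 2 / (2 * sigma ** 2) for x in xs)

    # Toy sample: a genuine mixture of two classes.
    xs = ([random.gauss(-3.0, 1.0) for _ in range(50)]
          + [random.gauss(3.0, 1.0) for _ in range(50)])

    PARAM_COST = 8.0  # assumed bits to state one parameter to working precision

    # One-class message: state (mu, sigma), then the data given them.
    mu = sum(xs) / len(xs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))
    one_class = 2 * PARAM_COST + nats_to_bits(gaussian_nll(xs, mu, sigma))

    # Two-class message: state both classes, one class label per datum
    # (1 bit each for two equally common classes), then each datum
    # given the class it was assigned to.
    left = [x for x in xs if x < 0]
    right = [x for x in xs if x >= 0]
    two_class = 4 * PARAM_COST + len(xs)  # parameters + class labels
    for grp in (left, right):
        m = sum(grp) / len(grp)
        s = math.sqrt(sum((x - m) ** 2 for x in grp) / len(grp))
        two_class += nats_to_bits(gaussian_nll(grp, m, s))

    print(f"one-class message : {one_class:8.1f} bits")
    print(f"two-class message : {two_class:8.1f} bits")

Even after paying for two extra parameters and one label per datum, the two-class message comes out shorter, because each datum is cheap to encode once its class is known. That is the sense in which a good classification is a brief encoding of the data.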
Contents
Strict Minimum Message Length (SMML)
Approximations to SMML
Structural Models
The Feathers on the Arrow of Time
Related Work