## Bounds on the sample complexity of Bayesian learning using information theory and the VC dimensionDavid Haussler, Michael Kearns, Robert E. Schapire, University of California, Santa Cruz. Computer Research Laboratory University of California, Santa Cruz, Computer Research Laboratory, 1991 - Machine learning - 29 pages Abstract: "In this paper we study a Bayesian or average-case model of concept learning with a twofold goal: to provide more precise characterizations of learning curve (sample complexity) behavior that depend on properties of both the prior distribution over concepts and the sequence of instances seen by the learner, and to smmoothly unite in a common framework the popular statistical physics and VC dimension theories of learning curves. To achieve this, we undertake a systematic investigation and comparison of two fundamental quantities in learning and information theory: the probability of an incorrect prediction for an optimal learning algorithm, and the Shannon information gain. This study leads to a new understanding of the sample complexity of learning in several existing models." |

### What people are saying - Write a review

We haven't found any reviews in the usual places.

### Common terms and phrases

average-case Bayes algorithm Bayes and Gibbs Bayesian bound given Chervonenkis chosen randomly according columns combinatorial parameters computational learning theory concept class consistent hypothesis cumulative information gain curves and cumulative define denote depend on properties distribution Q drawn independently drawn randomly entropy Equation 21 expected information gain expected total number f(xm fixed Gibbs algorithm Haussler homogeneous linear threshold independently with replacement indicator functions information theory information-theoretic instance sequence instance space distribution instances xi instantaneous information gain instantaneous mistake bounds instantaneous mistake probabilities learner learning curves lemma linear threshold functions Littlestone and Warmuth lower bounds mistake weight neural networks number of mistakes obtain perceived prior possible target concepts posterior probability predict incorrectly prior distribution probability of mistake random without replacement result sample complexity Santa Cruz Section sequence of instances statistical physics uniform convergence upper bounds Vapnik VC dimension analysis VC dimension theory version space Vm(f volume ratio Warmuth 17 xm+i