Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension
David Haussler, Michael Kearns, Robert E. Schapire
University of California, Santa Cruz, Computer Research Laboratory, 1991 - Machine learning - 29 pages
Abstract: "In this paper we study a Bayesian or average-case model of concept learning with a twofold goal: to provide more precise characterizations of learning curve (sample complexity) behavior that depend on properties of both the prior distribution over concepts and the sequence of instances seen by the learner, and to smoothly unite in a common framework the popular statistical physics and VC dimension theories of learning curves. To achieve this, we undertake a systematic investigation and comparison of two fundamental quantities in learning and information theory: the probability of an incorrect prediction for an optimal learning algorithm, and the Shannon information gain. This study leads to a new understanding of the sample complexity of learning in several existing models."
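The two quantities the abstract compares can be made concrete on a toy problem. The sketch below is our own illustration, not code from the paper: it implements the Bayes prediction rule (posterior-weighted majority vote over the version space) and the Gibbs rule (predict with a single hypothesis drawn from the posterior) for a finite class of one-dimensional threshold concepts under a uniform prior, and computes the instantaneous information gain as the negative log of the posterior mass (the "volume ratio") assigned to each observed label. All names and the example class are assumptions made for illustration.

```python
import math
import random

# Illustrative finite concept class (our choice, not the paper's):
# threshold functions f_t(x) = 1 iff x >= t on instances 0..9,
# with a uniform prior over the possible targets t = 0..10.
thresholds = list(range(11))
prior = {t: 1.0 / len(thresholds) for t in thresholds}

def concept(t, x):
    return 1 if x >= t else 0

def bayes_predict(version_space, x):
    # Bayes rule: posterior-weighted majority vote over consistent concepts.
    w1 = sum(prior[t] for t in version_space if concept(t, x) == 1)
    w0 = sum(prior[t] for t in version_space if concept(t, x) == 0)
    return 1 if w1 >= w0 else 0

def gibbs_predict(version_space, x, rng):
    # Gibbs rule: predict with one hypothesis drawn from the posterior
    # (uniform over the version space, since the prior is uniform).
    return concept(rng.choice(version_space), x)

target = 7
version_space = thresholds[:]   # concepts consistent with the data so far
rng = random.Random(0)
gains = []
for x in [2, 9, 5, 7, 6]:
    y = concept(target, x)
    # Instantaneous information gain: -log2 of the posterior mass
    # (volume ratio) that the current version space puts on label y.
    p_y = (sum(prior[t] for t in version_space if concept(t, x) == y)
           / sum(prior[t] for t in version_space))
    gains.append(-math.log2(p_y))
    # Shrink the version space to the concepts consistent with (x, y).
    version_space = [t for t in version_space if concept(t, x) == y]
```

After these five labeled instances the version space has collapsed to the target concept, and the cumulative information gain (the sum of `gains`) equals the total bits needed to identify it under the prior, as the paper's information-theoretic analysis formalizes.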