Bandit Problems: Sequential Allocation of Experiments
Notation and preliminaries; The discount sequence; Independent Bernoulli arms; Two arms, one arm known; Many independent arms; geometric discounting; Two independent Bernoulli arms; uniform discounting; Continuous-time bandits; Minimax approach.