Adaptive Representations for Reinforcement Learning
This book presents new algorithms for reinforcement learning, a form of machine learning in which an autonomous agent seeks a control policy for a sequential decision task. Since current methods typically rely on manually designed solution representations, agents that automatically adapt their own representations have the potential to dramatically improve performance. This book introduces two novel approaches for automatically discovering high-performing representations.

The first approach synthesizes temporal difference methods, the traditional approach to reinforcement learning, with evolutionary methods, which can learn representations for a broad class of optimization problems. This synthesis is accomplished by customizing evolutionary methods to the on-line nature of reinforcement learning and using them to evolve representations for value function approximators. The second approach automatically learns representations based on piecewise-constant approximations of value functions. It begins with coarse representations and gradually refines them during learning, analyzing the current policy and value function to deduce the best refinements.

This book also introduces a novel method for devising input representations. This method addresses the feature selection problem by extending an algorithm that evolves the topology and weights of neural networks so that it evolves their inputs as well. In addition to introducing these new methods, this book presents extensive empirical results in multiple domains demonstrating that these techniques can substantially improve performance over methods with manual representations.
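The first synthesis can be made concrete with a toy sketch. The code below is a minimal illustration of the idea, not the book's NEAT+Q algorithm: each genome is a tabular Q-function whose fitness is the reward it accrues while it is simultaneously fine-tuned by Q-learning during evaluation (a Lamarckian arrangement). The five-state chain environment and all parameters are illustrative assumptions.

```python
import random

# Toy deterministic chain MDP (an illustrative assumption, not from the book):
# states 0..4, action 0 moves left, action 1 moves right, and every step
# that lands in the rightmost state earns reward 1.
N_STATES = 5
ACTIONS = (0, 1)

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def episode(q, alpha=0.1, gamma=0.9, learn=True):
    """Run one greedy episode; if learn, also fine-tune q in place with
    Q-learning updates, so evaluation and TD learning happen together."""
    s, total = 0, 0.0
    for _ in range(2 * N_STATES):
        a = max(ACTIONS, key=lambda x: q[s][x])
        s2, r = step(s, a)
        if learn:
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
        total += r
        s = s2
    return total

def mutate(q, sigma=0.3):
    return [[w + random.gauss(0, sigma) for w in row] for row in q]

def evolve(generations=20, pop_size=10, seed=0):
    random.seed(seed)
    pop = [[[random.gauss(0, 1) for _ in ACTIONS] for _ in range(N_STATES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness = reward accrued; evaluating a genome also improves it
        # in place via the TD updates inside episode() (Lamarckian).
        pop.sort(key=episode, reverse=True)
        elite = pop[: pop_size // 2]
        pop = elite + [mutate(random.choice(elite))
                       for _ in range(pop_size - len(elite))]
    return max(episode(q, learn=False) for q in pop)
```

With these settings the best genome usually learns to head straight for the rewarding state (7.0 total reward in a 10-step episode), but the numbers are only meant to show the mechanics of combining evolution with TD updates.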
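The second approach, coarse-to-fine refinement of piecewise-constant approximations, can also be sketched briefly. The book's refinement criteria analyze the current policy and value function; as a simplified stand-in, the sketch below refines a one-dimensional piecewise-constant approximation of a known target function, splitting whichever tile has the largest approximation error. The target function and split budget are illustrative assumptions.

```python
import bisect

def refine(target, splits=8, samples_per_tile=32):
    """Approximate target on [0, 1) with a piecewise-constant function,
    starting from a single coarse tile and repeatedly halving the tile
    whose constant fit has the largest squared error."""
    bounds = [0.0, 1.0]                        # tile boundaries
    for _ in range(splits):
        scored = []
        for lo, hi in zip(bounds, bounds[1:]):
            xs = [lo + (hi - lo) * (i + 0.5) / samples_per_tile
                  for i in range(samples_per_tile)]
            mean = sum(map(target, xs)) / len(xs)   # best constant on tile
            err = sum((target(x) - mean) ** 2 for x in xs)
            scored.append((err, lo, hi))
        _, lo, hi = max(scored)                # worst-fit tile...
        bisect.insort(bounds, (lo + hi) / 2)   # ...gets split in half
    return bounds

bounds = refine(lambda x: x * x)
```

After eight splits on the target x^2, most boundaries land in the upper half of the interval, where the target changes fastest, mirroring how coarse-to-fine refinement spends representational capacity where it matters most.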
Contents
On-Line Evolutionary Computation
Evolutionary Function Approximation
Sample-Efficient Evolutionary Function Approximation
Automatic Feature Selection for Reinforcement Learning