Statistical Reinforcement Learning: Modern Machine Learning Approaches
Reinforcement learning (RL) is a framework for decision making in unknown environments based on a large amount of data. Several practical RL applications for business intelligence, plant control, and gaming have been successfully explored in recent years. Providing an accessible introduction to the field, this book covers model-based and model-free approaches, policy iteration, and policy search methods. It presents illustrative examples and state-of-the-art results, including dimensionality reduction in RL and risk-sensitive RL. The book provides a bridge between RL and data mining and machine learning research.
Other editions
Statistical Reinforcement Learning: Modern Machine Learning Approaches, Masashi Sugiyama, 2020
Statistical Reinforcement Learning: Modern Machine Learning Approaches, Masashi Sugiyama, 2015
Common terms and phrases
absolute loss; action space; active learning; algorithm; argmax; at,n; average; baseline subtraction; basis functions; Bellman equation; brush agent; Chapter; computed; conditional density estimation; cross-validation; data samples; dataset; defined; denotes; dimensionality reduction; Dirac's delta function; direct policy search; end effector; expected return; flattening parameter; footprint; Gaussian kernels; GGKs; gradient estimators; Grassmann manifold; illustrated in Figure; immediate reward; importance weight; IW-PGPE; IW-PGPE-OB; joint; Khepera; Khepera robot; least-squares policy iteration; LSCDE; machine learning; matrix; medial axis; method; model-free; Motion examples; natural gradients; number of episodes; off-policy; OGKs; optimal baseline; performance; PGPE; PGPE-OB; policy evaluation; policy parameter; policy update; policy-prior search; probability density; problem; reinforcement learning; robot agent; sample reuse; sampling policy; Section; SRPI; St+1; standard deviation; state-action value function; stochastic; Sugiyama; supervised learning; Total number; training samples; trajectory length; trajectory samples; transition model; value function approximation; variance of gradient; vector; Wmax; wmin; Пе