Handbook of Learning and Approximate Dynamic Programming
Contents
Foreword | 1 |
Reinforcement Learning and Its Relationship to Supervised Learning | 47 |
Model-Based Adaptive Critic Designs | 65 |
Guidance in the Use of Adaptive Critics for Control | 97 |
Direct Neural Dynamic Programming | 125 |
The Linear Programming Approach to Approximate Dynamic Programming | 153 |
Discussion | 173 |
Reinforcement Learning in Large, High-Dimensional State Spaces | 179 |
Hierarchical Decision Making | 203 |
Hierarchical Reinforcement Learning in Theory | 209 |
Hierarchical Reinforcement Learning in Practice | 217 |
Intra-Behavior Learning | 223 |
Improved Temporal Difference Methods with Linear Function Approximation | 235 |
Approximate Dynamic Programming for High-Dimensional | 261 |
Conclusion | 279 |
Hierarchical Approaches to Concurrency, Multiagency | 285 |
Learning and Optimization From a System Theoretic Perspective | 311 |
Robust Reinforcement Learning Using Integral-Quadratic | 337 |
Supervised Actor-Critic Reinforcement Learning | 359 |
Near-Optimal Control Via Reinforcement Learning | 407 |
Multiobjective Control Problems by Reinforcement Learning | 433 |
Adaptive Critic Based Neural Network for Control-Constrained | 463 |
Applications of Approximate Dynamic Programming in Power Systems | 479 |
Robust Reinforcement Learning for Heating Ventilation | 517 |
Helicopter Flight Control Using Direct Neural Dynamic Programming | 535 |
Toward Dynamic Stochastic Optimal Power Flow | 561 |
Control, Optimization, Security, and Self-healing of Benchmark | 599 |
Common terms and phrases
A-LSPE action network actor adaptive critic designs agent algorithm analysis applications approach approximate dynamic programming approximate LP Artificial Intelligence backpropagation Barto basis functions behavior Bellman chapter computational constraints control law control problems convergence cost cost-to-go function critic network curse of dimensionality defined derivatives direct NDP discussed equation error estimate example Figure formulation function approximation fuzzy goal Heuristic hierarchical IEEE Trans implemented improve input Intelligence learning learning algorithm linear programming Lyapunov function Machine Learning Markov chain Markov decision processes matrix methods minimize module neural network neurocontroller node nonlinear operating optimal control optimal policy output parameters performance Policy Gradient power system Proc Q-learning recurrent reinforcement learning reward robot robust control sample Section simulation solution solving space stability stochastic structure supervised learning task TD(λ) Theorem trajectory transition update Utility function value function variables vector voltage weights Werbos