Finite State Markovian Decision Processes
Page 14
... follows from the induction assumption. The right-hand side is again independent of h_{n-1}. Hence, V(R*, i ... follow from the fact that V_n*(i) = V_n(R*, i, h_{n-1}), for i ∈ I and n = 0, 1, ..., T, and the last ...
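The argument above is the finite-horizon backward induction: at each epoch the minimizing action depends only on the current state, not on the history h_{n-1}. A minimal sketch, with illustrative costs w[i, a] and transition probabilities q[i, a, j] (not taken from the book):

```python
import numpy as np

# Backward-induction sketch for a finite-horizon MDP.  The horizon T,
# states, actions, costs w, and transitions q are all illustrative.
T = 3                                   # epochs n = 0, 1, ..., T
w = np.array([[1.0, 2.0], [0.5, 3.0]])  # w[i, a]: one-stage expected cost
q = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])  # q[i, a, j]: transition probability

V = np.zeros(2)                       # terminal values V_{T+1}(i) = 0
policy = np.zeros((T + 1, 2), dtype=int)
for n in range(T, -1, -1):            # induction from epoch T down to 0
    Q = w + q @ V                     # Q[i, a] = w[i,a] + sum_j q[i,a,j] V(j)
    policy[n] = np.argmin(Q, axis=1)  # minimizer depends only on (n, i)
    V = Q.min(axis=1)                 # V_n*(i)

print(V)       # optimal expected total cost from each starting state
print(policy)  # a Markov policy R*: one action per (epoch, state)
```

Note that `policy[n]` is computed from `V` alone, which is exactly the independence from h_{n-1} the proof establishes.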
Page 24
... follows that Ψ_{R₀}(i, α) = Ψ_R(i, α), i ∈ I. This establishes the optimality of R₀ and also that Ψ_{R₀}(i, α) ... follows (see Theorem 3 of Appendix A) that Ψ_{R₀}(i, α) is a rational function of α for 0 ≤ α < 1. Let {α_n, n = 1, 2, ...} ...
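The rational-function claim can be seen concretely: for a fixed stationary policy, the expected discounted cost solves V = w + αPV, so V(α) = (I - αP)^{-1} w, and by Cramer's rule each component is a ratio of polynomials in α. A small sketch with an illustrative P and w (not the book's data):

```python
import numpy as np

# For a fixed stationary policy R, the discounted cost satisfies
# V = w + alpha * P V, hence V(alpha) = (I - alpha P)^(-1) w; by
# Cramer's rule each component is rational in alpha on [0, 1).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])      # transition matrix under R (illustrative)
w = np.array([1.0, 2.0])        # one-stage expected costs under R

def V(alpha):
    """Expected discounted cost vector for the fixed policy R."""
    return np.linalg.solve(np.eye(2) - alpha * P, w)

for alpha in (0.0, 0.5, 0.9):
    print(alpha, V(alpha))      # finite for every alpha in [0, 1)
```

Since costs here are positive, V(α) grows without bound as α → 1, but remains finite and rational on 0 ≤ α < 1.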
Page 44
... follows from the complementary slackness property of primal and dual linear programming problems (Theorem 5, Appendix C) that if {v_j, j ∈ I} is optimal for the primal problem, then v_i = w_{ia} + α Σ_j q_{ij}(a) v_j for those values of ...
Contents
Problems to Be Treated | 5
Finite Horizon Expected Cost Minimization | 11
Bibliographical Remarks | 17
Copyright