Finite State Markovian Decision Processes
Page ix
... Appendix A. Markov Chains 139; Bibliographical Notes 142; Appendix B. Some Theorems from Analysis and Probability Theory 143; Bibliographical Notes 147; Appendix C. Convex Sets and Linear Programming 149; Bibliographical Notes ...
Page 25
... Appendix B, since φ_R(i) = lim_{T→∞} S_{R,T}(i)/(T+1) when R ∈ C_D (a consequence of Theorem 1 of Appendix A), we have φ_R(i) = lim_{α→1} (1 − α) ψ_R(i, α); from Theorem 1(c) of Appendix B we ... Expected Average Cost ...
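The identity quoted in the snippet, that the expected average cost φ_R(i) equals lim_{α→1} (1 − α) ψ_R(i, α) where ψ_R(i, α) is the expected α-discounted cost, can be checked numerically for a fixed stationary policy. The following sketch is not from the book; the transition matrix `P` and cost vector `c` are illustrative assumptions, and only the snippet's notation is reused.

```python
import numpy as np

# Illustrative two-state chain under a fixed policy R (values assumed,
# not taken from the book): transition matrix P and one-step costs c.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
c = np.array([2.0, 5.0])

def discounted_cost(alpha):
    # psi_R(., alpha) is the unique solution of (I - alpha P) psi = c.
    return np.linalg.solve(np.eye(2) - alpha * P, c)

# Expected average cost phi_R via the stationary distribution pi (pi P = pi);
# for an ergodic chain phi_R(i) is the same for every starting state i.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
pi /= pi.sum()
phi = pi @ c

# (1 - alpha) * psi_R(i, alpha) approaches phi as alpha -> 1.
for alpha in (0.9, 0.99, 0.999):
    print(alpha, (1 - alpha) * discounted_cost(alpha), phi)
```

As α increases toward 1, both components of `(1 - alpha) * discounted_cost(alpha)` converge to the single average-cost value `phi`, which is the Abelian-limit relationship the snippet invokes.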
Page 27
... Appendix B) and by Theorem 6, Appendix B, lim_{T→∞} S_{R,T}(i)/(T+1) exists. Hence, φ_{R*}(i) ≤ φ_R(i). Now let us suppose there exists a policy R such that φ_R(i) < φ_{R*}(i) for some i. Then there will be an ε > 0 ...
Contents
Problems to Be Treated 5
Finite Horizon Expected Cost Minimization 11
Bibliographical Remarks 17
Copyright
(21 other sections not shown)