Linear Programming and Finite Markovian Control Problems
Common terms and phrases
a ∈ A(i), algorithm, AMD-model, assumption, average optimal policy, average reward, bias optimal policy, completely ergodic, completes the proof, compute an optimal, Consequently, constraints, construction, contracting dynamic programming, corresponding, defined, DENARDO, denote, dual linear programming, dynamic programming problem, ergodic set, exists, extreme feasible solution, extreme optimal solution, extreme point, f₁, follows, go to step, Hence, implies, infeasible, infinite solution, Iteration, lemma, Let f, linear programming problem, Markov chain induced, Markov decision problem, obtain, optimal stopping, Piaj, policy f, pure and stationary, REMARK, reward criterion, satisfies, semi-Markov, simplex method, simplex tableau, solution of problem, solution of program, stationary average optimal, stationary optimal policy, stationary policy, stochastic game, superharmonic, Suppose, transition probabilities, unichain, v₁, val TMG, variables, vector, X₁