Thesis No.   Thesis Record   Status
400894
A sample-path approach to time-average Markov decision processes /
Author: MELİKE BAYKAL GÜRSOY
Advisor: DR. KEITH W. ROSS
Location: University of Pennsylvania / Foreign Institute
Subject: Computer Engineering and Computer Science and Control
Index:
Approved
Doctorate
English
1988
111 p.
Time-average Markov decision problems are considered for finite state and action spaces. Several definitions of variability are introduced and compared. For the multichain case, it is shown that a stationary policy maximizes one of the criteria, namely the expected long-run average variability. An algorithm that uses a decomposition approach to locate such an optimal policy is given. The algorithm produces an optimal pure policy under convexity conditions on the variability function. Unichain semi-Markov decision processes are then examined. It is shown that a stationary policy maximizes the expected average reward subject to the condition that the long-run average cost is below a certain level with probability 1. A fractional program is presented that produces such an optimal stationary policy. Two-person zero-sum stochastic games are also considered. In the case that only one player controls the transition probabilities, stationary policies are shown to exist that give the saddle-point solution for the multichain expected long-run average reward. An algorithm using the decomposition theory is developed to find optimal stationary policies for both players. In the case that both players control the transition probabilities, a generalized game is obtained. The solution of this game gives optimal stationary policies for the players if the game is irreducible.
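As a rough illustration of the sample-path viewpoint named in the title, the sketch below compares two quantities for a fixed pure stationary policy: the expected long-run average reward, computed from the stationary distribution of the induced chain, and the time average observed along one simulated trajectory. The two-state MDP, its transition probabilities and rewards, and all function names are hypothetical choices made for this example and are not taken from the thesis. It does not implement the decomposition algorithm or the fractional program described in the abstract.

```python
import random

# Hypothetical two-state, two-action MDP (illustrative numbers only).
# P[s][a] is the next-state distribution; R[s][a] is the one-step reward.
P = {
    0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
    1: {0: [0.5, 0.5], 1: [0.1, 0.9]},
}
R = {
    0: {0: 1.0, 1: 0.0},
    1: {0: 2.0, 1: 3.0},
}


def stationary_distribution(policy, iters=1000):
    """Power iteration on the Markov chain induced by a pure stationary policy."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[s] * P[s][policy[s]][t] for s in range(n)) for t in range(n)]
    return pi


def expected_average_reward(policy):
    """Expected long-run average reward: sum over s of pi(s) * R(s, policy(s))."""
    pi = stationary_distribution(policy)
    return sum(pi[s] * R[s][policy[s]] for s in range(len(P)))


def sample_path_average(policy, steps=200_000, seed=0):
    """Time-average reward along a single simulated trajectory starting in state 0."""
    rng = random.Random(seed)
    state, total = 0, 0.0
    for _ in range(steps):
        action = policy[state]
        total += R[state][action]
        # Two states only: move to state 0 with probability P[state][action][0].
        state = 0 if rng.random() < P[state][action][0] else 1
    return total / steps


policy = [0, 1]  # action 0 in state 0, action 1 in state 1
g = expected_average_reward(policy)
g_hat = sample_path_average(policy)
print(f"expected average reward: {g:.4f}")
print(f"sample-path average:     {g_hat:.4f}")
```

Under this policy the induced chain has a single recurrent class, so the sample-path average should approach the expected value as the number of steps grows. In a multichain model the limit along a trajectory can depend on which recurrent class the path ends up in, which is one reason the two notions need to be distinguished.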