Time-average Markov decision problems with finite state and action spaces are considered. Several definitions of variability are introduced and compared. For the multichain
case, it is shown that a stationary policy maximizes one of these criteria, namely, the
expected long-run average variability. An algorithm which uses a decomposition
approach to locate such an optimal policy is given. The algorithm produces an
optimal pure policy under convexity conditions for the variability function. The
Unichain semi-Markov decision processes are then examined. It is shown that a stationary
policy maximizes the expected average reward subject to the constraint that the long-run
average cost is below a certain level with probability 1. A fractional program is
presented which produces such an optimal stationary policy. Two-person zero-sum
stochastic games are also considered. In the case that only one player controls the
transition probabilities, stationary policies are shown to exist which give the saddle-point
solution for the multichain expected long-run average reward. An algorithm
using the decomposition theory is developed to find optimal stationary policies for
both players. In the case that both players control the transition probabilities, a
generalized game is obtained. The solution of this game yields optimal stationary
policies for the players when the game is irreducible.