Fan, Jingnan. Process-based risk measures and risk-averse control of observable and partially observable discrete-time systems. Retrieved from https://doi.org/doi:10.7282/T3PC35KX
Description: In this thesis, we develop the theoretical foundations of dynamic risk measures for controlled stochastic processes and apply the theory to Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). We consider a new class of dynamic risk measures for controlled discrete-time stochastic processes, which we call process-based. By introducing a new concept of stochastic conditional time consistency, we derive the structure of process-based risk measures that enjoy this property. We show that such risk measures can be equivalently represented by a collection of static law-invariant risk measures on the space of functions of the state of the base process. The results are first specialized to MDPs, in which we use process-based dynamic risk measures to evaluate control policies. We derive the refined structure of risk measures for this class of problems, along with the associated dynamic programming equations. We then specialize our theory to POMDPs. In contrast to an MDP, in a POMDP only part of the state can be observed, and the rest must be inferred conditionally on the observations. We show that stochastically conditionally time-consistent dynamic risk measures can be represented by a sequence of law-invariant risk measures on the space of functions of the observable part of the state. The corresponding dynamic programming equations are also derived. Finally, as an application of our theory to POMDPs, we study a model of a machine deterioration problem.
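
To give a concrete feel for the kind of dynamic programming equations mentioned in the description, the following is a minimal sketch of a risk-averse backward recursion for a small, fully observed, finite-horizon MDP. It is an illustration under stated assumptions, not the thesis's formulation: the mean-upper-semideviation is used here only as one familiar example of a law-invariant risk measure, the function names `mean_semideviation` and `risk_averse_dp` are hypothetical, and the two-state keep/repair machine and all of its numbers are invented for the example. The partially observable case treated in the thesis is not covered by this sketch.

```python
import numpy as np


def mean_semideviation(values, probs, kappa=0.5):
    """Mean-upper-semideviation E[Z] + kappa * E[(Z - E[Z])_+] of a discrete Z.

    Assumption for this sketch: kappa lies in [0, 1].
    """
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mean = float(np.dot(probs, values))
    upper_dev = float(np.dot(probs, np.maximum(values - mean, 0.0)))
    return mean + kappa * upper_dev


def risk_averse_dp(P, cost, terminal_cost, horizon, kappa=0.5):
    """Backward recursion v_t(x) = min_u [ c(x, u) + rho(v_{t+1}; P[u, x, :]) ].

    P has shape (n_actions, n_states, n_states); cost has shape
    (n_states, n_actions). Returns the time-0 value function and the
    greedy policy for every stage.
    """
    n_actions, n_states, _ = P.shape
    v = np.asarray(terminal_cost, dtype=float)
    policy = np.zeros((horizon, n_states), dtype=int)
    for t in reversed(range(horizon)):
        q = np.empty((n_states, n_actions))
        for x in range(n_states):
            for u in range(n_actions):
                # Risk of the next-stage value under the transition law of (x, u).
                q[x, u] = cost[x, u] + mean_semideviation(v, P[u, x], kappa)
        policy[t] = q.argmin(axis=1)
        v = q.min(axis=1)
    return v, policy


# Hypothetical toy data: state 0 = good machine, state 1 = bad machine;
# action 0 = keep running, action 1 = repair.
P = np.array([
    [[0.8, 0.2], [0.0, 1.0]],  # keep
    [[1.0, 0.0], [0.9, 0.1]],  # repair
])
cost = np.array([
    [0.0, 2.0],  # good machine: keep, repair
    [3.0, 2.0],  # bad machine: keep, repair
])
terminal = np.zeros(2)

v_neutral, _ = risk_averse_dp(P, cost, terminal, horizon=5, kappa=0.0)
v_averse, policy = risk_averse_dp(P, cost, terminal, horizon=5, kappa=0.5)
```

Setting `kappa=0.0` recovers the usual expected-cost recursion, so comparing `v_neutral` with `v_averse` shows how much the chosen risk measure adds to the evaluation of the same toy problem.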