TY - THES
TI - Local planning for continuous Markov decision processes
DO - https://doi.org/10.7282/T3BR8Q83
PY - 2014
AB - This dissertation discusses algorithms that create plans to maximize a numeric reward over time. A general formulation of this problem is reinforcement learning (RL), which has traditionally been restricted to small discrete domains. Here, we are instead concerned with domains that violate this assumption: domains that are both continuous and high dimensional. Swimming, riding a bicycle, and walking are concrete examples of such domains, and simulations of these problems are tackled here. To plan in continuous domains, it has become common practice to uniformly discretize each dimension of the problem and then apply a discrete planner, leading to exponential growth in problem size as dimension increases. Furthermore, traditional methods develop a policy for the entire domain simultaneously, with planning costs that are at best polynomial in the size of the problem, which (as mentioned) grows exponentially with dimension under uniform discretization. To sidestep this problem, I propose a twofold approach: using algorithms designed to function natively in continuous domains, and performing planning locally. Planners that operate natively in continuous domains avoid difficult decisions about how coarsely to discretize the problem, allowing more flexible algorithms that allocate and use samples of transitions and rewards more efficiently. Local planning algorithms partially sidestep the curse of dimensionality, as their planning costs depend on the planning horizon rather than on the size of the domain. The properties of some local continuous planners are discussed from a theoretical perspective. Empirically, continuous planners are shown to outperform their discrete counterparts. Both theoretically and empirically, algorithms designed to operate natively in continuous domains are shown to be simpler to use while providing higher-quality results more efficiently.
KW - Computer Science
KW - Markov processes
KW - Markov processes--Numerical solutions
KW - Reinforcement learning
LA - eng
ER - 