MDPs with continuous state space (scratch)
Posted: 2016-10-14 , Modified: 2016-10-25
Tags: none
Come up with a class of MDPs on an exponentially large/continuous state space that is interesting and tractable. Think of generalizing from contextual bandits.

* Basically, we want a reasonable model of an MDP with a very large (exponential or continuous) state space, and to be able to do something with it. We wanted to include dynamics like those in Kalman filters, but we weren't sure whether Kalman filters are tractable (a sketch of the linear-Gaussian dynamics follows this list).
* Todo: learn about Kalman filters.
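A minimal sketch of the kind of linear-Gaussian dynamics a Kalman filter tracks, just to fix ideas; the function and the matrix names (\(A, C, Q, R\)) are illustrative, not part of any model proposed here.

```python
import numpy as np

def kalman_step(mu, Sigma, y, A, C, Q, R):
    """One predict/update step for the linear-Gaussian model
       x' = A x + w,   w ~ N(0, Q)   (state dynamics)
       y  = C x' + v,  v ~ N(0, R)   (observation)
    where (mu, Sigma) is the current posterior mean/covariance of x."""
    # Predict: push the posterior through the dynamics.
    mu_pred = A @ mu
    Sigma_pred = A @ Sigma @ A.T + Q
    # Update: condition on the observation y.
    S = C @ Sigma_pred @ C.T + R               # innovation covariance
    K = Sigma_pred @ C.T @ np.linalg.inv(S)    # Kalman gain
    mu_new = mu_pred + K @ (y - C @ mu_pred)
    Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_pred
    return mu_new, Sigma_new
```

Each step is a constant number of matrix operations, which is one sense in which inference in this continuous-state model is tractable.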
This setting looks like reinforcement learning + control theory. Prior work? How is RL used in continuous systems right now? Basic control theory background?
Need the model to be a generalization of the regular (discrete) MDP.
(*) may be interesting from a control theory perspective, but it doesn't generalize the discrete MDP. (It seems best to learn the dynamics, and then do the optimal thing from there…)
It captures deterministic MDPs, but not probabilistic ones, by letting \(A=\{e_i\}\).
Do as well as the best Bayes net? Actions lie in some class. There is the case of a finite set of actions vs. an exponential/continuous set of actions; in the latter case, the result will depend on the optimizability of that set… (see the sketch after the example below).

Ex.: the class is an SVM.
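The finite vs. continuous distinction matters at action-selection time. A hedged sketch, assuming we already have some estimator \(q(x,a)\): over a finite action set we can simply enumerate, while over a continuous set we need \(q(x,\cdot)\) to be optimizable (e.g. differentiable, ideally concave). The gradient oracle `q_grad_a` below is a hypothetical placeholder.

```python
import numpy as np

def best_action_finite(q, x, actions):
    """Exhaustive argmax over a finite action set."""
    return max(actions, key=lambda a: q(x, a))

def best_action_continuous(q_grad_a, x, a0, steps=100, lr=0.1):
    """Gradient ascent over a continuous action set.
    Assumes q(x, .) is differentiable (and ideally concave, so the ascent
    actually finds the maximizer); q_grad_a(x, a) returns d q(x, a) / d a."""
    a = np.array(a0, dtype=float)
    for _ in range(steps):
        a += lr * q_grad_a(x, a)
    return a
```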
“Do as well as the best estimator of the \(q\)-function in a certain class (assume convexity or something?)” (cf. contextual bandits first).
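In the contextual-bandit special case (one-step horizon), "do as well as the best estimator in a class" has a concrete form: regress observed rewards onto features of (context, action) within the class, then act greedily. A minimal sketch assuming a linear class; the feature map `phi` and the data format are illustrative assumptions.

```python
import numpy as np

def fit_q_linear(data, phi, dim, lam=1.0):
    """Ridge-regression fit of the best linear reward estimator
    q(x, a) = <theta, phi(x, a)> in the class, from logged
    (context, action, reward) triples (contextual-bandit data)."""
    A = lam * np.eye(dim)
    b = np.zeros(dim)
    for x, a, r in data:
        f = phi(x, a)
        A += np.outer(f, f)
        b += r * f
    return np.linalg.solve(A, b)

def greedy_action(theta, phi, x, actions):
    """Act greedily with respect to the fitted estimator."""
    return max(actions, key=lambda a: theta @ phi(x, a))
```

The convexity remark above corresponds to the fit being a convex (least-squares) problem; in the full MDP case the observed reward would presumably be replaced by a bootstrapped target (as in fitted-\(q\) iteration), which is where the question gets harder.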