Factored Markov Decision Processes 1
Solution methods described in the MDP framework (Chapters 1 and 2) share a common bottleneck: they are not adapted to solve large problems. Indeed, using non-structured representations requires an explicit enumeration of the possible states in the problem. The complexity of this enumeration grows exponentially with the number of variables in the problem.
EXAMPLE 4.1. In the case of the car to be maintained, the number of possible states of a car can be huge. For instance, each part of the car can have its one wearout state. The idea of factored representations is that some parts of this huge state do not depend on each other and that this structure can be exploited to derive a more compact representation of the global state and obtain more efficiently an optimal policy. For instance, changing the oil in the car should have no effect on the breaks, thus we do not need to care about the state of the breaks to determine an optimal oil changing policy.
This chapter aims to describe FMDPs (Factored Markov Decision Processes), first proposed by [BOU 95, BOU 99]. FMDPs are an extension of MDPs that makes it possible to represent the transition and the reward functions of some problems compactly (compared to an explicit enumeration of state-action pairs). First, we describe the framework and how problems are modeled (section 4.2). Then we describe different planning methods able to take advantage of the structure of the problem ...