Volume 6 Issue 1
Jan. 2019

IEEE/CAA Journal of Automatica Sinica

Dimitri P. Bertsekas, "Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations," IEEE/CAA J. Autom. Sinica, vol. 6, no. 1, pp. 1-31, Jan. 2019.
Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations

doi: 10.1109/JAS.2018.7511249
In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem, whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation than by the linear function of the features provided by neural network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
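To make the idea of an aggregate problem concrete, the following is a minimal sketch of one simple special case, hard aggregation, in which every original state is assigned to exactly one aggregate state by a feature map. It is an illustration under stated assumptions, not the implementation proposed in the paper: the names `aggregate_bellman`, `solve_aggregate`, `P`, `G`, and `phi`, the uniform disaggregation weights, and the four-state example are all hypothetical. The aggregate costs `r` are computed by value iteration on the aggregate Bellman equation, and the cost of an original state `i` is then approximated by `r[phi[i]]`, which is piecewise constant over the feature classes.

```python
def aggregate_bellman(r, P, G, phi, alpha):
    """One sweep of the aggregate Bellman operator (hard aggregation).

    P[u][i][j]: transition probability from state i to j under control u.
    G[u][i]:    expected one-stage cost at state i under control u.
    phi[i]:     index of the aggregate state that contains state i.
    Assumes every aggregate state contains at least one original state
    and weights its members uniformly (an illustrative choice).
    """
    n, num_agg = len(phi), len(r)
    new_r = []
    for x in range(num_agg):
        members = [i for i in range(n) if phi[i] == x]
        total = 0.0
        for i in members:
            # Minimize over controls, using r[phi[j]] as the cost of state j.
            total += min(
                G[u][i] + alpha * sum(P[u][i][j] * r[phi[j]] for j in range(n))
                for u in range(len(P))
            )
        new_r.append(total / len(members))
    return new_r


def solve_aggregate(P, G, phi, num_agg, alpha, iters=500):
    """Value iteration on the aggregate problem, starting from zero."""
    r = [0.0] * num_agg
    for _ in range(iters):
        r = aggregate_bellman(r, P, G, phi, alpha)
    return r


# Hypothetical example: 4 states on a line, 2 controls.
P = [
    # control 0: stay in place
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    # control 1: move one step right (the last state stays put)
    [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1]],
]
G = [
    [3.0, 2.0, 1.0, 0.0],  # stage costs under control 0
    [4.0, 3.0, 2.0, 0.0],  # stage costs under control 1
]
phi = [0, 0, 1, 1]  # feature map: states {0, 1} and {2, 3}

r = solve_aggregate(P, G, phi, num_agg=2, alpha=0.9)
J_approx = [r[phi[i]] for i in range(len(phi))]  # piecewise-constant approximation
print(r, J_approx)
```

Because the aggregate operator is a sup-norm contraction with modulus `alpha`, a fixed number of sweeps is enough for this small example; a practical implementation would more likely stop when successive iterates agree to a tolerance.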


