PROFESSOR OF COMPUTER SCIENCE, BROWN UNIVERSITY
ASSESSING AND IMPROVING GENERALIZATION IN DEEP REINFORCEMENT LEARNING
Deep reinforcement-learning approaches have been shown to produce remarkable performance on a range of challenging control tasks. Observations of the resulting behavior give the impression that agents construct rich task representations that support insightful action decisions. Looking closely at the generalization capacity of deep Q-networks, however, we find that the learned value computations often reduce to brittle memorization, and that the networks fail to handle even small non-adversarial modifications to the states they encounter during execution. We examine training methods that improve generalization capability. Our results provide strong evidence that not all deep networks learn robust behaviors, and that achieving robust behavior requires careful consideration during training.
Coffee and pastries will be served.