Learning to Follow Grounded Language Instructions in the "REAL" World
|Date||13 Jul 2018 (Friday)|
|Time||10:30 – 11:30|
|Venue||Padma & Hari Harilela Lecture Theater C, 1/F, Academic Building, HKUST|
|Speaker||Dr. Edward T. GREFENSTETTE|
||Staff Research Scientist, Google DeepMind|
||Honorary Reader & Associate Professor at University College London|
Reinforcement Learning (RL) generally presupposes the availability of a possibly sparse, but primarily correct, reward signal from the environment, with which to reward an agent for behaving appropriately within the context of a task. Teaching agents to follow instructions using RL is a quintessentially multi-task problem: each instruction in a possibly combinatorially rich language corresponds to a specific task, for which there must be a reward function against which the agent will learn. This has so far largely limited the RL community to forms of instruction languages (e.g. templated instructions) for which families of reward functions can be specified and individual reward functions can be generated. In this talk, I discuss a new method which takes a step towards RL "in the wild", exploring a richer set of instruction languages and enabling us to expose agents to a wide variety of tasks without needing to perpetually design reward functions over environment states.
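To make the multi-task framing concrete, the following is a minimal, purely illustrative sketch (not the speaker's method; all names and the toy state representation are assumptions) of how a templated instruction language induces a family of reward functions, one per instantiated instruction:

```python
# Illustrative sketch: a templated instruction language such as
# "go to the <color> <object>" induces one task (and hence one
# hand-designed reward function) per template instantiation.
from itertools import product

COLORS = ["red", "blue"]
OBJECTS = ["ball", "key"]

def make_reward_fn(color, obj):
    """Return a reward function for the task 'go to the {color} {obj}'."""
    def reward(state):
        # `state` is a toy dict; "at" names what the agent has reached.
        return 1.0 if state.get("at") == (color, obj) else 0.0
    return reward

# One reward function per instruction: the family grows combinatorially
# with the template's slots, which is why hand-specifying rewards does
# not scale to richer instruction languages.
tasks = {
    f"go to the {c} {o}": make_reward_fn(c, o)
    for c, o in product(COLORS, OBJECTS)
}

success = tasks["go to the red ball"]({"at": ("red", "ball")})   # 1.0
failure = tasks["go to the red ball"]({"at": ("blue", "key")})   # 0.0
```

Even this tiny two-slot template yields four distinct reward functions; the talk's motivation is to avoid authoring such functions over environment states for every instruction.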
Dr. Edward T. Grefenstette is a Staff Research Scientist at DeepMind, and Honorary Associate Professor at UCL. He completed his doctorate in the Quantum Group of Oxford's Department of Computer Science. Following a short postdoc at Oxford, during which he held a Junior Research Fellowship at Somerville College, he founded Dark Blue Labs with Karl Moritz Hermann, Phil Blunsom, and Nando de Freitas, which was acquired by DeepMind in 2014.