From Pixels to Torques

The vision of fully autonomous and intelligent systems that learn by themselves has influenced AI and robotics research for many decades. To devise fully autonomous systems, it is necessary to (1) process perceptual data (e.g., images) to summarize knowledge about the surrounding environment and the system’s behavior in this environment, (2) make decisions based on uncertain and incomplete information, and (3) take new information into account for learning and adaptation. Effectively, any fully autonomous system has to close this perception-action-learning loop without relying on specific human expert knowledge.
The pixels-to-torques problem captures key aspects of an autonomous system: autonomous thinking and decision making using sensor measurements only, intelligent exploration, and learning from mistakes.

We consider the scenario where a camera observes a scene in which a robot is moving about (see figure above). The objective is to learn a closed-loop control policy from pixel information only, such that the robot solves a particular task, while keeping the number of trials small.

To solve this task data-efficiently, we propose to learn compact representations of images, which we use to learn predictive models and controllers in this lower-dimensional feature space. In particular, our approach to learning from pixels to torques is to jointly learn a lower-dimensional embedding of images and a transition function that we can use for internal simulation of the dynamical system. For this purpose, we employ deep auto-encoders for the lower-dimensional embedding and a multi-layer feed-forward neural network for the transition function. We use this deep dynamical model as a generative model for trajectories and apply an adaptive model-predictive-control (MPC) algorithm for online closed-loop control of the robotic agent, which is practically based on pixel information only.
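
To make the structure of this pipeline concrete, the following is a minimal sketch and not our actual implementation. It assumes random, untrained linear maps in place of the deep auto-encoder and the feed-forward transition network, a one-step transition on the current feature only, a joint loss made of reconstruction plus one-step prediction error, a quadratic cost towards an encoded goal image, and simple random-shooting MPC. All names and sizes (`encode`, `decode`, `transition`, `joint_loss`, `mpc_action`, `D_IMG`, `D_Z`, `D_U`) are hypothetical and chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 16x16 grayscale image, 2-D feature space, 1-D torque.
D_IMG, D_Z, D_U = 256, 2, 1

# Stand-in parameters (assumption): random linear maps, NOT trained networks.
W_enc = rng.normal(scale=0.1, size=(D_Z, D_IMG))   # encoder weights
W_dec = rng.normal(scale=0.1, size=(D_IMG, D_Z))   # decoder weights
A = 0.9 * np.eye(D_Z)                              # feature-space dynamics
B = rng.normal(scale=0.5, size=(D_Z, D_U))         # effect of the torque


def encode(image):
    """Map a flattened image to low-dimensional features."""
    return np.tanh(W_enc @ image)


def decode(z):
    """Map features back to a flattened image."""
    return W_dec @ z


def transition(z, u):
    """Predict the next feature vector from the current one and a torque."""
    return A @ z + B @ u


def joint_loss(img_t, u_t, img_next):
    """Assumed joint objective: reconstruction + one-step prediction error."""
    z_t = encode(img_t)
    recon_err = np.sum((decode(z_t) - img_t) ** 2)
    pred_err = np.sum((decode(transition(z_t, u_t)) - img_next) ** 2)
    return recon_err + pred_err


def mpc_action(img_t, img_goal, horizon=5, n_candidates=200, u_max=1.0):
    """Random-shooting MPC: roll out sampled torque sequences in feature
    space and return the first torque of the cheapest sequence."""
    z0, z_goal = encode(img_t), encode(img_goal)
    candidates = rng.uniform(-u_max, u_max, size=(n_candidates, horizon, D_U))
    best_cost, best_u = np.inf, None
    for seq in candidates:
        z, cost = z0, 0.0
        for u in seq:
            z = transition(z, u)               # internal simulation step
            cost += np.sum((z - z_goal) ** 2)  # distance to goal features
        if cost < best_cost:
            best_cost, best_u = cost, seq[0]
    return best_u


# Usage: one control step from a current image towards a goal image.
current_image = rng.uniform(size=D_IMG)
goal_image = rng.uniform(size=D_IMG)
torque = mpc_action(current_image, goal_image)
```

In a closed loop, only the returned first torque would be applied; the next camera image is then encoded and the optimization repeated. The sketch omits training and the adaptive part, in which the model is updated as new data arrive.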

Contact
Marc Deisenroth

Collaborators
Niklas Wahlström (Division of Automatic Control, Linköping University, Sweden)
Thomas B. Schön (Department of Information Technology, Uppsala University, Sweden)
