Aligning Robot Representations with Humans
Workshop CoRL 2022 - December 15th (Hybrid)
In person location: ENG Building 405 Room 470 (405.470)
In person poster session: 4th floor of ENG 405; virtual poster session: Gather.Town
Submit your questions in the Pheedloop chat
Massachusetts Institute of Technology
University of Utah
Georgia Institute of Technology
|08:30 am - 08:45 am|| Organizers
|08:45 am - 09:15 am|| Mark Ho
Artificial intelligence, natural stupidity, and resource rational cognition
There is a fundamental tension in AI and cognitive science between human intelligence (we want to build systems with human-like intelligence) and human stupidity (we know that humans are cognitively limited and can be irrational). As the psychologist Amos Tversky, whose work on people's cognitive biases won the Nobel Prize in Economics, put it: "My colleagues, they study artificial intelligence; me, I study natural stupidity." How can these two views on human cognition be reconciled and inform the design of AI systems? My talk will discuss recent advances in resource rationality, a general theoretical framework that seeks to explain humans' puzzling combination of intelligence and stupidity as a consequence of our condition as boundedly rational decision makers. I will focus on my own work on resource rational representations, the challenges and promise of this approach, and how this perspective can help guide the development of AI systems that effectively and safely help us overcome our cognitive limitations.
|09:15 am - 09:45 am|| Jacob Andreas
Toward natural language supervision
In the age of deep networks, "learning" almost invariably means "learning from examples". Image classifiers are trained with large datasets of images, machine translation systems with corpora of translated sentences, and robot policies with rollouts or demonstrations. When human learners acquire new concepts and skills, we often do so with richer supervision, especially in the form of language: we learn new concepts from exemplars accompanied by descriptions or definitions, and new skills from demonstrations accompanied by instructions. In natural language processing, recent years have seen a number of successful approaches to learning from task definitions and other forms of auxiliary language-based supervision. But these successes have been largely confined to tasks that also involve language as an input and an output. What will it take to make language-based training useful for the rest of the machine learning ecosystem? In this talk, I'll present two recent applications of natural language supervision to tasks outside the traditional domain of NLP: using language to guide visuomotor policy learning and inductive program synthesis. In these applications, natural language annotations reveal latent compositional structure in the space of programs and plans, helping models discover reusable abstractions for perception and interaction. This kind of compositional structure is present in many tasks beyond policy learning and program synthesis, and I'll conclude with a brief discussion of how these techniques can be applied even more generally.
|09:45 am - 10:00 am||Coffee Break|
|10:00 am - 10:30 am|| Lerrel Pinto
Teaching Robots to Manipulate in an Hour
I want to teach robots complex and dexterous behaviors in diverse real-world environments. But what is the fastest way to teach robots in the real world? Among the prominent options in our robot learning toolbox, Sim2real requires careful modeling of the world, while real-world self-supervised learning or RL is far too slow. Currently, the only reasonably efficient approach that I know of is imitating humans. But making imitation learning feasible on real robots is not easy. Current methods often require complicated demonstration collection setups, rely on having expert roboticists train them, and even then need a significant number of demonstrations to learn effectively. In this talk, I will present two ideas that can make robot learning far easier than it currently is. First, to collect demonstrations more easily, we will use vision-based demonstration collection devices. This allows untrained humans to easily collect demonstrations with consumer-grade products. Second, to learn from these visual demonstrations, I will propose a new imitation learning algorithm that puts data efficiency at the forefront. Together, these ideas allow for significantly faster and easier imitation on a variety of real-world manipulation tasks.
|10:30 am - 11:00 am|| Matthew Gombolay
Confronting the Correspondence Problem with Self-supervised and Interactive Machine Learning
New advances in robotics offer a promise of revitalizing final assembly manufacturing, assisting in personalized at-home healthcare, and even scaling the power of earth-bound scientists for robotic space exploration. Yet, manually programming robots for each end user's ad hoc needs is intractable. Interactive Machine Learning techniques seek to enable end users to program robots intuitively, for example through skill demonstration, natural language instruction, and feedback. Yet, humans and robots alike struggle in situated learning interactions because of the correspondence problem: humans and robots perceive, think, and physically act differently. In this talk, I will present our latest work in developing interactive machine learning methods that seek to (1) enable users to program robots intuitively, (2) enable robots to characterize misspecified input and feedback from human end users, and (3) close the loop on situated learning interactions through explainable Artificial Intelligence (XAI) techniques. The outcome of our research is a set of design principles that go towards addressing fundamental issues of the correspondence problem for democratizing robotics.
|11:00 am - 12:00 pm||Contributed Talks|
|12:00 pm - 12:30 pm|| Daniel Brown
Latent Spaces and Learned Representation for Better Human Preference Learning
In this talk, I will discuss some of our recent work that uses latent spaces and representation learning to enable better human-robot interaction. I will discuss the importance of having the “right” latent space to better teach robots to act in ways that are aligned with human preferences, approaches for learning latent space embeddings for efficient Bayesian reward learning and generalizable robot assistance, and the use of task-agnostic similarity queries as a step towards the goal of enabling efficient learning of multiple downstream tasks using a single shared representation.
|12:30 pm - 01:30 pm||Lunch Break|
|01:30 pm - 02:00 pm||Coffee Break|
|02:00 pm - 02:30 pm||Conference Opening Session|
|02:30 pm - 03:00 pm|| Amy Zhang
Attending to What Matters with Representation Learning
In this talk, we focus on three different ways to extract additional signal from various, easily available data sources to improve human-robot alignment. We first present how state abstractions can accelerate reinforcement learning from rich observations, such as images, by disentangling task-relevant from irrelevant details using the reward signal. However, while reward is the canonical way to specify a task in reinforcement learning, it is often difficult to specify a well-shaped reward function in robotics. We then focus on goal-conditioned tasks and ways to extract and generalize functional equivariance. Finally, we explore how human demonstrations can be used to learn a representation that captures a dense reward signal for robotics tasks.
|03:00 pm - 03:30 pm|| Dorsa Sadigh
Aligning Humans and Robots: Active Elicitation of Informative and Compatible Queries
Aligning robot objectives with human preferences is a key challenge in robot learning. In this talk, I will start by discussing how active learning of human preferences can effectively query humans with the most informative questions to learn their preference reward functions. I will discuss some of the limitations of prior work, and how approaches such as few-shot learning can be integrated with active preference-based learning to reduce the number of queries to a human expert and truly bring humans into the loop of learning neural reward functions. I will then talk about how we could go beyond active learning from a single human, and tap into large language models (LLMs) as another source of information to capture human preferences that are hard to specify. I will discuss how LLMs can be queried within a reinforcement learning loop and help with reward design. Finally, I will discuss how the robot can also provide useful information to the human and be more transparent about its learning process. We demonstrate how the robot’s transparent behavior can guide the human to provide compatible demonstrations that are more useful and informative for learning.
|03:30 pm - 04:00 pm|| George Konidaris
Reintegrating AI: Skills, Symbols, and the Sensorimotor Dilemma
I will address the question of how a robot should learn an abstract, task-specific representation of an environment, which I will argue is the key capability required to achieve generally-intelligent robots. I will present a constructivist approach, where the computation the representation is required to support - here, planning using a given set of motor skills - is precisely defined, and then its properties are used to build the representation so that it is capable of doing so by construction. The result is a formal link between the skills available to a robot and the symbols it should use to plan with them. I will present an example of a robot autonomously learning a (sound and complete) abstract representation directly from sensorimotor data, and then using it to plan. I will also discuss ongoing work on making the resulting abstractions portable across tasks.
|04:00 pm - 05:00 pm||
|05:00 pm - 05:10 pm|| Organizers
|05:10 pm - 06:00 pm||Poster Session (in person: 4th floor of ENG 405; virtual: Gather.Town)
Congratulations to Abhijat Biswas (Mitigating causal confusion in driving agents via gaze supervision) and Ruohan Zhang (A Dual Representation Framework for Robot Learning with Human Guidance) for each winning a Best Paper Award!