Integrating Human Demonstrations in Hierarchical Reinforcement Learning

When:
31/03/2023 – 01/04/2023 all-day

Offer related to the Action/Network: – — –/– — –

Laboratory/Company: ENSTA Paris, Computer Science and System Engineering Department
Duration: 6 months
Contact: sao-mai.nguyen@ensta-paris.fr
Publication deadline: 2023-03-31

Context:
Fully autonomous robots have the potential to impact real-life applications, such as assisting elderly people. Autonomous robots must deal with uncertain and continuously changing environments, where it is not possible to pre-program the robot's tasks. Instead, the robot must continuously learn new tasks and learn how to perform more complex tasks by combining simpler ones (i.e., a task hierarchy). This problem is called lifelong learning of hierarchical tasks.

Subject:
Hierarchical Reinforcement Learning (HRL) is a recent approach for learning to solve long and complex tasks by decomposing them into simpler subtasks. HRL can be regarded as an extension of the standard Reinforcement Learning (RL) setting, as it features high-level agents selecting subtasks to perform and low-level agents learning actions or policies to achieve them. We recently proposed an HRL algorithm, GARA (Goal Abstraction via Reachability Analysis), that aims to learn an abstract model of the subgoals of the hierarchical task.
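For readers unfamiliar with this setting, the following is a minimal, illustrative sketch of the two-level control loop underlying HRL. It is not the GARA implementation; all names (RandomHighLevelAgent, low_policy, the gym-style env interface, the subgoal-reached test) are placeholder assumptions.

import numpy as np

class RandomHighLevelAgent:
    """Placeholder high-level agent: picks a subgoal from a fixed abstract set."""
    def __init__(self, subgoals):
        self.subgoals = subgoals

    def select_subgoal(self, state):
        return self.subgoals[np.random.randint(len(self.subgoals))]

def run_episode(env, high_agent, low_policy, max_low_steps=50):
    """High level selects subtasks; low level acts for a bounded number of steps."""
    state = env.reset()
    done = False
    while not done:
        subgoal = high_agent.select_subgoal(state)   # high level: choose the next subtask
        for _ in range(max_low_steps):               # low level: try to reach that subgoal
            action = low_policy(state, subgoal)
            state, _, done, _ = env.step(action)
            if done or np.allclose(state, subgoal, atol=0.1):
                break                                # subgoal reached or episode finished
    return state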
However, HRL can still be limited when faced with high-dimensional state spaces and real-world open-ended environments. Introducing a human teacher into reinforcement learning algorithms has been shown to bootstrap learning performance. Moreover, active imitation learners such as in [1] have shown that they can strategically choose the most useful questions to ask a human teacher: they can choose what, when, and whom to ask for demonstrations [2,3].
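As a concrete, purely illustrative example of one such choice, the sketch below decides when to ask and about which subtask: it requests a demonstration for the subtask with the lowest estimated competence, and only if that competence falls below a threshold. The competence estimates and the threshold are assumptions made for the example, not taken from [1-3].

def choose_query(competence_per_subtask, threshold=0.6):
    """Return the subtask to request a demonstration for, or None to keep self-practising."""
    subtask, competence = min(competence_per_subtask.items(), key=lambda kv: kv[1])
    return subtask if competence < threshold else None

# Example: the learner asks about the subtask it currently masters least.
competence = {"reach": 0.9, "grasp": 0.4, "place": 0.7}
print(choose_query(competence))  # -> "grasp"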
This internship's goal is to explore how active imitation learning can improve the GARA algorithm. The intuition in this context is that human demonstrations can be used to determine the structure of the task (i.e., which subtasks need to be achieved) as well as a planning strategy to solve it (i.e., the order in which to achieve the subtasks).
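The sketch below illustrates this intuition under a simplifying assumption that is not part of the internship specification: a demonstration is a sequence of states, and candidate subgoals are the states where the demonstrator pauses. The resulting ordered list could seed both the task structure and the order in which subtasks are planned.

import numpy as np

def extract_subgoal_sequence(demo_states, pause_threshold=0.01):
    """Return, in order, the states where the demonstration nearly stops."""
    subgoals = []
    for prev, curr in zip(demo_states[:-1], demo_states[1:]):
        if np.linalg.norm(np.asarray(curr) - np.asarray(prev)) < pause_threshold:
            subgoals.append(curr)
    return subgoals

# Toy demonstration in a 2D workspace: the demonstrator stops twice.
demo = [(0.0, 0.0), (0.5, 0.0), (0.5, 0.0), (0.5, 0.5), (0.5, 0.5)]
print(extract_subgoal_sequence(demo))  # -> [(0.5, 0.0), (0.5, 0.5)]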
During this internship, we will:
• Study the relevant state of the art and formulate a research hypothesis about the usefulness of introducing human demonstrations into the considered HRL algorithm.
• Design and implement a component to learn from human demonstrations in GARA.
• Conduct an experimental evaluation to assess the research hypothesis.
The intern is also expected to collaborate with a PhD student whose work is closely related to this topic.

Candidate profile:
The intern should be enrolled in a master's program (either M1 or M2) in Computer Science or Robotics.

Education and required skills:
The student should have prior knowledge of (e.g., have taken courses in) machine learning, deep learning, and reinforcement learning, and be motivated to complete a research-focused internship.

Work address:
ENSTA Paris, Computer Science and System Engineering Department

Attached document: 202302021428_internshipActiveImitationLearning.pdf