Learning Human Preferences for Physical Human-Robot Cooperation

L.F. van der Spaa

Research output: Thesis › Dissertation (TU Delft)


Abstract

Physical human-robot cooperation (pHRC) has the potential to combine human and robot strengths in a team that can achieve more than a human and a robot working on the task separately. However, how much of this potential can be realized depends on the quality of cooperation, in which awareness of the partner’s intention and preferences plays an important role. Preferences tend to be highly personal, and additionally depend on the cooperation partner and the cooperation itself. They can be hard to define in terms a robot would understand, and may change over time. This thesis focuses on learning ‘useful models’ from observed behavior, to let our robot adapt its behavior to better match its human partner’s preferences, and thus improve the cooperation.
The aim is to capture personalized approximate models of human preferences (how a person likes to do something) from very few interactive observations, providing only small amounts of imprecise data, such that the robot can use the model to improve each user’s comfort. First, we learn a model to predict and optimize the human ergonomics in a pHRC task, such that our robot can propose a plan, for both the human and itself, to solve the task in a way that is more ergonomic for its human partner. However, people do not necessarily prefer to act ergonomically, nor do we want to impose on them what a robot thinks best. Therefore, we next apply inverse reinforcement learning (IRL) to capture less restrictive preference models: 1) path and velocity preferences for motion planning, and 2) on a higher level of abstraction, which (grasp or motion) action to initiate for proactive physical support. To learn the correct action to take in cooperation, we developed the disagreement-aware variable impedance (DAVI) controller to smoothly transition between providing active guidance and allowing the human to demonstrate alternative behavior…
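The abstract names a disagreement-aware variable impedance (DAVI) controller that blends active guidance with yielding to the human. The snippet below is only a minimal Python sketch of that general idea, assuming disagreement is estimated from the measured interaction force; the class name, gains, and the force-based disagreement metric are illustrative assumptions, not the controller specified in the thesis.

```python
import numpy as np

class DisagreementAwareImpedance:
    """Minimal sketch of a disagreement-aware variable impedance law.

    The robot tracks a planned reference with a spring-damper law. When
    the measured interaction force indicates sustained disagreement (the
    human pushing against the plan), the stiffness is scaled down so the
    human can demonstrate an alternative motion; stiffness recovers once
    the disagreement fades. All gains and the disagreement metric are
    illustrative placeholders, not the thesis' actual DAVI parameters.
    """

    def __init__(self, k_max=400.0, k_min=20.0, d=40.0,
                 force_scale=15.0, smoothing=0.9):
        self.k_max = k_max              # stiffness while guiding [N/m]
        self.k_min = k_min              # stiffness while yielding [N/m]
        self.d = d                      # damping [Ns/m]
        self.force_scale = force_scale  # force treated as "full" disagreement [N]
        self.smoothing = smoothing      # low-pass factor on the estimate
        self.disagreement = 0.0         # filtered disagreement in [0, 1]

    def command(self, x, x_dot, x_ref, f_human):
        # Filtered, normalized disagreement from the measured human force.
        raw = min(np.linalg.norm(f_human) / self.force_scale, 1.0)
        self.disagreement = (self.smoothing * self.disagreement
                             + (1.0 - self.smoothing) * raw)

        # Interpolate stiffness between full guidance and compliance.
        k = self.k_max - self.disagreement * (self.k_max - self.k_min)

        # Cartesian spring-damper command toward the planned reference.
        return k * (x_ref - x) - self.d * x_dot
```

In such a scheme, `command` would be called every control cycle with the end-effector state, the planned reference, and the estimated human wrench; phases in which the disagreement stays high could then be treated as human demonstrations of the preferred behavior, e.g. as input to an IRL-based preference update.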
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Delft University of Technology
Supervisors/Advisors
  • Kober, J., Supervisor
  • Babuška, R., Supervisor
Award date: 1 Feb 2024
Print ISBNs: 978-94-6483-764-3
Electronic ISBNs: 978-94-6483-779-7
Publication status: Published - 2024

Bibliographical note

Dr. M. Gienger contributed significantly to the realization of the dissertation.

Keywords

  • Physical Human-Robot Interaction
  • Human-Robot Collaboration
  • human preferences
  • human-centered planning
  • Inverse Reinforcement Learning
