Robots are destined to move beyond caged factory floors into domains where they will
interact closely with humans. There they will encounter highly varied environments,
scenarios, and user demands, so programming robots after deployment will be an important
requirement. To address this challenge, the field of Learning from Demonstration (LfD)
emerged with the vision of programming robots through demonstrations of the desired
behavior instead of explicit programming. LfD has been an active area of robotics research
for more than 30 years, yet very little work has examined the implications of having a
non-robotics expert as the teacher. This thesis aims to bridge this gap by developing
learning from demonstration algorithms and interaction paradigms that allow non-expert
users to teach robots new skills.
The first step of the thesis was to evaluate how non-expert teachers provide demonstrations
to robots. Keyframe demonstrations are introduced to the field of LfD to help people teach
skills to robots and are compared with traditional trajectory demonstrations. The utility of
keyframes is validated through a series of experiments with more than 80 participants. Based
on these experiments, a hybrid of trajectory and keyframe demonstrations is proposed to take
advantage of both, and a method is developed to learn from trajectory, keyframe, and hybrid
demonstrations in a unified way.
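As a purely illustrative sketch, and not the thesis's actual representation or learning method, the snippet below shows one naive way trajectory, keyframe, and hybrid demonstrations could be brought into a common sparse form: keyframe segments are kept as-is and dense trajectory segments are subsampled. All names and the subsampling step are assumptions made for the example.

```python
# Illustrative sketch only: unify trajectory, keyframe, and hybrid
# demonstrations into a sparse keyframe sequence. All names and the
# subsampling step are assumptions; this is not the thesis's method.
from dataclasses import dataclass
from typing import List, Sequence

@dataclass
class Pose:
    """End-effector position; orientation omitted for brevity."""
    x: float
    y: float
    z: float

@dataclass
class Segment:
    """A contiguous piece of a demonstration."""
    poses: List[Pose]
    dense: bool  # True for trajectory segments, False for keyframe segments

def to_keyframes(segments: Sequence[Segment], step: int = 10) -> List[Pose]:
    """Keep keyframe segments as-is and subsample dense trajectory segments
    every `step` poses, always retaining each segment's final pose."""
    keyframes: List[Pose] = []
    for seg in segments:
        if not seg.poses:
            continue
        if seg.dense:
            sub = seg.poses[::step]
            if sub[-1] != seg.poses[-1]:
                sub.append(seg.poses[-1])  # keep the segment endpoint
            keyframes.extend(sub)
        else:
            keyframes.extend(seg.poses)
    return keyframes
```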
A key insight from these user experiments was that teachers are goal-oriented: they
concentrated on achieving the goal of the demonstrated skill rather than on providing
high-quality demonstrations. Based on this observation, this thesis introduces a method that
learns action models and goal models from the same set of demonstrations. The action models
are used to execute the skill and the goal models to monitor its execution. A user study with
eight participants and two skills showed that successful goal models can be learned from
non-expert teacher data even when the resulting action models are not as successful.
Following these results, this thesis further develops a self-improvement algorithm that uses
the goal monitoring output to improve the action models without further user input. This
approach is validated with an expert user and two skills. Finally, this thesis builds an
interactive LfD system that incorporates both goal learning and self-improvement and
evaluates it with 12 naive users and three skills. The results suggest that teacher feedback
during experiments improves skill execution and monitoring success. Moreover, non-expert
data can serve as a seed for self-improvement to fix unsuccessful action models.
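For illustration only, and under the assumption that a learned goal model can be treated as a binary success classifier over executions, the following sketch shows the general shape of such an execute-monitor-refine loop. It is not the thesis's actual self-improvement algorithm, and every name in it is hypothetical.

```python
# Illustrative sketch only: a generic execute-monitor-refine loop in which a
# learned goal model judges executions of a learned action model, and accepted
# executions are fed back as additional training data. All names are
# hypothetical; this is not the thesis's actual self-improvement algorithm.
from typing import Callable, List, Tuple

Trajectory = List[Tuple[float, float, float]]  # placeholder execution record

def self_improve(
    execute: Callable[[], Trajectory],            # runs the current action model
    goal_reached: Callable[[Trajectory], bool],   # goal model as a success classifier
    retrain: Callable[[List[Trajectory]], None],  # updates the action model
    iterations: int = 20,
) -> int:
    """Repeatedly execute the skill; keep executions the goal model accepts
    and retrain on them, requiring no further input from the teacher."""
    accepted: List[Trajectory] = []
    for _ in range(iterations):
        outcome = execute()
        if goal_reached(outcome):      # monitoring step
            accepted.append(outcome)
            retrain(accepted)          # refinement step
    return len(accepted)
```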