This paper addresses two problems: learning dynamic models of hybrid systems from demonstrations, and imitating those demonstrations by using Bayesian filtering. A linear programming-based approach is used to develop a nonparametric kernel-based conditional density estimation technique that infers accurate and concise dynamic models of system evolution from data. The training data for these models are acquired from demonstrations by teleoperation. The trained data-driven models of mode-dependent state evolution and state-dependent mode evolution are then used online to imitate the demonstrated tasks via particle filtering. Results of simulation and experimental validation with a hexapod robot are reported to show that the proposed learning and control algorithms generalize.
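
To make the idea of a kernel-based conditional density model of state evolution concrete, the following is a minimal sketch and not the paper's method. It assumes uniform kernel weights (a plain Nadaraya-Watson-type estimate), whereas the paper obtains sparse weights by linear programming. The function names, bandwidths, and demonstration data are all hypothetical.

```python
import math

def gaussian_kernel(u, bandwidth):
    """One-dimensional Gaussian kernel with the given bandwidth."""
    return math.exp(-0.5 * (u / bandwidth) ** 2) / (bandwidth * math.sqrt(2.0 * math.pi))

def conditional_density(x_next, x_now, samples, h_now=0.5, h_next=0.5):
    """Estimate p(x_next | x_now) from (x_now_i, x_next_i) demonstration pairs.

    Uniform kernel weights are an assumption of this sketch; the paper
    instead obtains sparse weights from a linear program.
    """
    num = 0.0
    den = 0.0
    for xi, yi in samples:
        # Weight each demonstration pair by how close its current state is to x_now.
        w = gaussian_kernel(x_now - xi, h_now)
        num += w * gaussian_kernel(x_next - yi, h_next)
        den += w
    return num / den if den > 0.0 else 0.0

# Hypothetical demonstration data: the state roughly follows x_next = 0.9 * x_now.
demo_pairs = [(x / 10.0, 0.9 * x / 10.0) for x in range(-20, 21)]

# A transition consistent with the demonstrations should score higher
# than one that contradicts them.
p_likely = conditional_density(0.9, 1.0, demo_pairs)
p_unlikely = conditional_density(-1.5, 1.0, demo_pairs)
```

In a particle filter, a density of this kind could serve as the transition model used to propagate and weight particles; a sparse weight vector, as sought in the paper, would reduce the number of kernel evaluations per query.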
