Assembly remains a major challenge for manufacturing automation. The peg-in-hole problem represents a class of manipulation tasks that require continuous motion control in both unconstrained and constrained environments, making it difficult to perform with robots. In this work, we adapt the ideas underlying successful human manipulation, namely variable compliance and learning, to robotic assembly. Based on the sensed interaction between the peg and the hole, the proposed controller switches between passive compliance and active regulation in continuous spaces, outperforming fixed-compliance controllers. Experimental results show that the robot learns a suitable stiffness strategy together with the trajectory policy through trial and error. Moreover, the resulting variable compliance policy is robust to different initial states and generalizes to more complex situations.
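The variable-compliance idea can be illustrated with a minimal sketch of one control step: a policy outputs both a trajectory increment and a stiffness command, which feed a Cartesian impedance law. Everything here is an illustrative assumption, not the paper's actual controller: the hand-coded switching rule stands in for a learned policy, and the gains and thresholds are arbitrary.

```python
import numpy as np

def impedance_force(k, d, x_d, x, v):
    """Illustrative 1-D Cartesian impedance law: F = K (x_d - x) - D v."""
    return k * (x_d - x) - d * v

def policy(state):
    """Hypothetical stand-in for a learned policy.

    A real implementation would be a neural network trained by trial and
    error (e.g., with an actor-critic method); here a fixed rule mimics
    the behavior described in the abstract: stiff in free space, compliant
    once contact forces are sensed.
    """
    x, v, f_contact = state
    dx = -0.001 * np.sign(x)                        # step toward the hole axis
    k = 2000.0 if abs(f_contact) < 5.0 else 300.0   # variable stiffness command
    return dx, k

# One control step: position error [m], velocity [m/s], sensed force [N].
state = (0.01, 0.0, 12.0)
dx, k = policy(state)
x_d = state[0] + dx
f_cmd = impedance_force(k, d=50.0, x_d=x_d, x=state[0], v=state[1])
print(round(k, 1), round(f_cmd, 4))
```

The sensed force of 12 N exceeds the (assumed) contact threshold, so the sketch selects the compliant stiffness and commands a small corrective force rather than a rigid push, which is the qualitative behavior the abstract attributes to the learned controller.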
