Graphical Abstract Figure

Overall structure of the proposed adaptive robot motion planning approach


Abstract

Advanced motion planning is crucial for safe and efficient robotic operations in various smart manufacturing scenarios, such as assembly, packaging, and palletizing. Compared to traditional motion planning methods, Reinforcement Learning (RL) shows better adaptability to complex and dynamic working environments. However, training RL models is often time-consuming, and determining well-behaved reward function parameters is challenging. To tackle these issues, an adaptive robot motion planning approach is proposed based on digital twin and reinforcement learning. The core idea is to adaptively select geometry-based or RL-based methods for robot motion planning through a real-time distance detection mechanism, which reduces the complexity of RL model training and accelerates the training process. In addition, Bayesian Optimization is integrated within RL training to refine the reward function parameters. The approach is validated with a Digital Twin-enabled robot system on five tasks (Pick and Place, Drawer Open, Light Switch, Button Press, and Cube Push) in dynamic environments. Experimental results show that our approach outperforms the traditional RL-based method with improved training speed and guaranteed task performance. This work contributes to the practical deployment of adaptive robot motion planning in smart manufacturing.
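
As a rough illustration of the switching idea described in the abstract, the minimal sketch below selects between a geometry-based planner and an RL policy using the real-time end-effector-to-obstacle distance reported by the digital twin. The threshold SWITCH_DISTANCE, the state layout, and the geometry_planner/rl_policy interfaces are hypothetical placeholders for illustration only, not details taken from the paper.

```python
import numpy as np

# Minimal sketch of distance-triggered planner selection.
# The threshold value, state dictionary layout, and planner/policy
# interfaces are illustrative assumptions, not the published method.
SWITCH_DISTANCE = 0.15  # metres; hypothetical switching threshold


def min_obstacle_distance(ee_position, obstacle_positions):
    """Smallest Euclidean distance between the end-effector and any
    obstacle pose streamed from the digital twin."""
    return min(
        np.linalg.norm(np.asarray(ee_position) - np.asarray(obs))
        for obs in obstacle_positions
    )


def plan_next_action(state, geometry_planner, rl_policy, obstacle_positions):
    """Adaptively pick the planner for the next control step."""
    distance = min_obstacle_distance(state["ee_position"], obstacle_positions)
    if distance > SWITCH_DISTANCE:
        # Far from obstacles: a fast geometry-based plan is sufficient.
        return geometry_planner.step(state)
    # Close to an obstacle: hand control to the learned RL policy,
    # which only needs to cover this near-obstacle regime.
    return rl_policy.act(state)
```

Under this kind of split, the RL policy only has to learn behavior in the near-obstacle regime, which is consistent with the abstract's claim that adaptive selection reduces the complexity of RL training and speeds it up.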
