Abstract

Efficient exploration of design spaces is highly sought after in engineering applications. A spectrum of tools has been proposed to deal with the computational difficulties associated with such problems. In the context of our case study, these tools can be broadly classified into optimization and supervised learning approaches. Optimization approaches, while successful, are inherently data inefficient, with evolutionary optimization-based methods being a good example. This inefficiency stems from data not being reused from previous design explorations. Alternatively, supervised learning-based design paradigms are data efficient. However, the quality of the resulting solutions depends heavily on the quality of the available data. Furthermore, it is difficult to incorporate physics models and domain knowledge into purely learning-based methods. In this work, we formulate a reinforcement learning (RL)-based design framework that mitigates the disadvantages of both approaches. Our framework finds solutions that are more efficient than those of supervised learning approaches while using data more efficiently than genetic algorithm (GA)-based optimization approaches. We illustrate our framework on a problem of microfluidic device design for flow sculpting, and our results show that a single generic RL agent is capable of exploring the solution space to achieve multiple design objectives. Additionally, we demonstrate that the RL agent can be used to solve more complex problems using a targeted refinement step. Thus, we address both the data inefficiency of optimization-based methods and the limited-data problem of supervised learning-based methods. The versatility of our framework is illustrated by using it to gain domain insights and to incorporate domain knowledge. We envision that such RL frameworks will have an impact on design science.
