Abstract

Materials science requires the collection and analysis of large quantities of data. These data almost invariably require post-acquisition computation to remove noise, classify observations, fit parametric models, or perform other operations. Recently developed machine-learning (ML) algorithms have proven highly capable at many of these operations and often produce higher-quality output than traditional methods. However, such algorithms are widely observed to suffer from limited generalizability and a tendency to overfit the input data. To address these issues, this work introduces a metacomputing framework capable of systematically selecting, tuning, and training the best available ML model to process an input dataset. In addition, a unique “cross-training” methodology is used to incorporate underlying physics or multiphysics relationships into the structure of the resultant ML model. This metacomputing approach is demonstrated on four example problems: repairing “gaps” in a multiphysics dataset, improving the output of electron backscatter diffraction (EBSD) crystallographic measurements, removing spurious artifacts from X-ray microtomography data, and identifying material constitutive relationships from tensile test data. The performance of the metacomputing framework on these disparate problems is discussed, as are plans for further deploying metacomputing technologies in materials science and mechanical engineering.
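The idea of systematically selecting and tuning a model can be illustrated with a small sketch. The code below is not the framework described in this work: it is a minimal, standard-library Python example of cross-validated model selection, and every name in it (`fit_knn`, `fit_line`, `cv_error`, `select_model`, `CANDIDATES`) is hypothetical. It assumes a one-dimensional regression problem on synthetic noisy data, two toy model families, and held-out mean squared error as the selection criterion. It does not attempt the “cross-training” step.

```python
import math
import random
from statistics import mean


def fit_knn(xs, ys, k):
    """k-nearest-neighbour regressor: average the k closest training targets."""
    data = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(data, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)

    return predict


def fit_line(xs, ys, _param=None):
    """Ordinary least-squares straight line (no hyperparameter)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cv_error(fit, param, xs, ys, folds=5):
    """Mean squared error on held-out points, averaged over interleaved folds."""
    n = len(xs)
    fold_errors = []
    for f in range(folds):
        test = list(range(f, n, folds))
        held_out = set(test)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y, param)
        fold_errors.append(mean((model(xs[i]) - ys[i]) ** 2 for i in test))
    return mean(fold_errors)


def select_model(candidates, xs, ys):
    """Score every (family, hyperparameter) pair and retrain the winner on all data."""
    scored = [
        (cv_error(fit, param, xs, ys), name, param, fit)
        for name, fit, grid in candidates
        for param in grid
    ]
    err, name, param, fit = min(scored, key=lambda s: s[0])
    return name, param, fit(xs, ys, param), err


# Synthetic stand-in for a measured dataset (assumption: noisy sine curve).
rng = random.Random(0)
xs = [2 * math.pi * i / 59 for i in range(60)]
ys = [math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]

# Each candidate: (family name, fitting function, hyperparameter grid).
CANDIDATES = [
    ("knn", fit_knn, (1, 3, 5, 9)),
    ("line", fit_line, (None,)),
]

name, param, model, err = select_model(CANDIDATES, xs, ys)
print(f"selected {name} (param={param}), CV mean squared error {err:.4f}")
```

Scoring each candidate only on points withheld from training is what guards against overfitting in this sketch: a model that memorizes noise is penalized on the held-out fold. A fuller selection loop would presumably search over many more model families and tuning parameters than the two toy families used here.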
