Abstract

For centuries, researchers have sought ways to connect disparate areas of knowledge. While early scholars (Galileo, da Vinci, etc.) were experts across fields, specialization took hold later. With the advent of artificial intelligence, we can now explore relationships across areas (e.g., mechanics-biology) or disparate domains (e.g., failure mechanics-art). To achieve this, we use a fine-tuned large language model (LLM), here for a subset of knowledge in multiscale materials failure. The approach uses a general-purpose LLM to distill question-answer pairs from raw sources, followed by LLM fine-tuning. The resulting MechGPT LLM foundation model is used in a series of computational experiments to explore its capacity for knowledge retrieval, various language tasks, hypothesis generation, and connecting knowledge across disparate areas. While the model has some ability to recall knowledge from training, we find that LLMs are particularly useful for extracting structural insights through Ontological Knowledge Graphs. These interpretable graph structures provide explanatory insights, frameworks for new research questions, and visual representations of knowledge that can also be used in retrieval-augmented generation. Three versions of MechGPT are discussed, ranging in size from 13 × 10⁹ to 70 × 10⁹ parameters and reaching context lengths of more than 10,000 tokens. This provides ample capacity for sophisticated retrieval-augmented strategies, for agent-based modeling in which multiple LLMs interact collaboratively and/or adversarially, for incorporating new data from the literature or web searches, and for multimodality.
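As a rough illustration of how a knowledge graph can link concepts from different areas, the following minimal Python sketch builds a small graph from (subject, relation, object) triples and searches for the shortest chain of concepts between two nodes. It is not the MechGPT implementation: the triples are hypothetical placeholders standing in for what an LLM might be prompted to extract, and the function names `build_graph` and `connect` are assumptions introduced only for this example.

```python
from collections import defaultdict, deque

# Hypothetical triples for illustration only; they are not taken from the paper's data.
TRIPLES = [
    ("crack", "propagates in", "brittle material"),
    ("brittle material", "example", "silicon"),
    ("crack", "is blunted by", "dislocation"),
    ("dislocation", "occurs in", "ductile material"),
    ("hierarchical structure", "increases", "toughness"),
    ("toughness", "resists", "crack"),
    ("bone", "has", "hierarchical structure"),
]


def build_graph(triples):
    """Store each triple as an undirected, relation-labeled edge."""
    graph = defaultdict(dict)
    for subject, relation, obj in triples:
        graph[subject][obj] = relation
        graph[obj][subject] = relation  # direction is dropped in this simplification
    return graph


def connect(graph, start, goal):
    """Breadth-first search for the shortest concept chain, or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    path = connect(graph, "bone", "ductile material")
    print(" -> ".join(path))
```

In a retrieval-augmented setting, a chain of concepts found this way could be passed to the model as context. The sketch treats edges as undirected for simplicity, so the direction of each relation is not preserved.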
