Abstract

Semantic knowledge of part-part and part-whole relationships in assemblies is useful for a variety of tasks, from searching design repositories to constructing engineering knowledge bases. In this work, we propose that the natural language names designers use in computer-aided design (CAD) software are a valuable source of such knowledge, and that large language models (LLMs) contain useful domain-specific information for working with this data as well as for other CAD and engineering-related tasks. In particular, we extract and clean a large corpus of natural language part, feature, and document names and use this corpus to quantitatively demonstrate that a pre-trained language model can outperform numerous benchmarks on three self-supervised tasks, without ever having seen this data before. Moreover, we show that fine-tuning on the text data corpus further boosts performance on all tasks, thus demonstrating the value of the text data, which until now has been largely ignored. We also identify key limitations to using LLMs with text data alone, and our findings provide strong motivation for further work on multi-modal text-geometry models. To aid and encourage further work in this area, we make all our data and code publicly available.
