Abstract

In this work, a data-driven modeling framework based on a Variational Autoencoder (VAE) is developed with the overarching goal of enabling fuel design. The VAE is trained on a large dataset of chemical species to learn a compressed latent-space representation of molecules. Each chemical structure is supplied as a Simplified Molecular Input Line Entry System (SMILES) string, encoded into the VAE latent space, and decoded back to a SMILES string using Long Short-Term Memory (LSTM) networks. The VAE training loss function is examined in detail by varying the weight (the 𝜷 parameter) of the latent space regularization term, in order to assess the balance between reconstruction accuracy and validity, with attention to both accurate reconstruction of molecular structures and latent space consistency. Two strategies for varying 𝜷 are evaluated: linear annealing and cyclic annealing. The impact of total correlation adjustment and hierarchical priors is also studied, both with respect to the balance between reconstruction fidelity and latent space regularization and with respect to potential issues such as posterior collapse, over-regularization, and poor disentanglement of latent variables. Overall, the best model performance is achieved with hierarchical priors and with 𝜷 increased incrementally from 0 to a threshold value of 0.25 over 75 epochs. The generative VAE model can be readily coupled with Quantitative Structure–Property Relationship (QSPR) analysis to develop an integrated end-to-end framework for fuel-property prediction and molecular design of promising new fuels.
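
The two 𝜷 schedules named above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the cycle length of 25 epochs, and the ramp fraction of 0.5 are assumptions chosen for illustration. Only the threshold of 0.25 and the 75-epoch ramp are taken from the abstract. The loss is written in the common 𝜷-weighted form (reconstruction term plus 𝜷 times the KL regularization term), which is likewise assumed here.

```python
def linear_beta(epoch, beta_max=0.25, warmup_epochs=75):
    """Linear annealing: ramp beta from 0 to beta_max, then hold it there.

    beta_max=0.25 and warmup_epochs=75 follow the values quoted in the abstract.
    """
    if warmup_epochs <= 0:
        return beta_max
    return beta_max * min(1.0, epoch / warmup_epochs)


def cyclic_beta(epoch, beta_max=0.25, cycle_epochs=25, ramp_fraction=0.5):
    """Cyclic annealing: restart the ramp at the start of every cycle.

    cycle_epochs and ramp_fraction are hypothetical values, not from the source.
    Within a cycle, beta rises linearly during the first ramp_fraction of the
    cycle and is held at beta_max for the remainder.
    """
    position = (epoch % cycle_epochs) / cycle_epochs  # position in cycle, [0, 1)
    return beta_max * min(1.0, position / ramp_fraction)


def weighted_vae_loss(reconstruction_loss, kl_divergence, beta):
    """Assumed beta-weighted objective: reconstruction + beta * KL term."""
    return reconstruction_loss + beta * kl_divergence


if __name__ == "__main__":
    # Print both schedules at a few epochs to compare their shapes.
    for epoch in (0, 10, 25, 30, 75, 100):
        print(epoch, round(linear_beta(epoch), 4), round(cyclic_beta(epoch), 4))
```

With 𝜷 equal to 0 the objective reduces to pure reconstruction, so both schedules let the decoder learn to reproduce SMILES strings before the regularization term is given weight. The cyclic variant repeats this warm-up periodically.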
