Optimization is inherent in designing the sequence of amino acid monomer types along the long heteropolymer chain of a protein so that it folds to a desired conformation. Building upon our previous work, in which a continuous parametrization and a deterministic optimization approach were introduced for protein sequence design, we present in this paper an alternative formulation that leads to a quadratic programming problem in the first stage of a two-stage design procedure. The new quadratic formulation, which uses linear interpolation of the monomer states in Stage I, can be solved to identify the globally optimal sequence(s). Furthermore, the global minimum of the quadratic programming problem gives a lower bound on the energy attainable in sequence space for a given conformation. In practice, even a local optimization algorithm often yields sequences that attain the global minimum, as demonstrated by the examples considered in this paper. The Stage I solutions then provide an appropriate initial guess for the second stage, where an interpolation based on a rescaled Gaussian probability distribution function is used to refine the states to their original discrete states. The performance of the method is demonstrated with HP (hydrophobic and polar) lattice models of proteins. Its results are compared with those of exhaustive enumeration and of our earlier method, which uses a graph-spectral method in Stage I. The computational efficiency of the new method is also demonstrated by designing HP models of real proteins. The method outlined in this paper is applicable to very large chains and can be extended to the case of multiple monomer types.
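As a rough illustration of the kind of objective a quadratic Stage I relaxes, the sketch below evaluates an HP lattice energy that is quadratic in per-residue state variables and finds the best discrete sequence by exhaustive enumeration. It is not the paper's formulation or solver. Several things here are assumptions made for illustration only: the energy of -1 per non-bonded H-H contact, the fixed number of H monomers, the 3x3 example conformation, and all function and variable names.

```python
from itertools import combinations


def contact_pairs(coords):
    """Non-bonded contacts: residues one lattice step apart that are not chain neighbours."""
    pairs = []
    for i, j in combinations(range(len(coords)), 2):
        if j - i > 1:
            (xi, yi), (xj, yj) = coords[i], coords[j]
            if abs(xi - xj) + abs(yi - yj) == 1:
                pairs.append((i, j))
    return pairs


def energy(x, pairs):
    """Quadratic energy -sum(x_i * x_j) over contacts.

    x_i = 1 encodes H and x_i = 0 encodes P (assumed encoding); values in
    between correspond to linearly interpolated, non-physical states.
    """
    return -sum(x[i] * x[j] for i, j in pairs)


def enumerate_best(n, n_h, pairs):
    """Exhaustively search all sequences with exactly n_h H monomers."""
    best_energy, best_sequences = None, []
    for h_positions in combinations(range(n), n_h):
        x = [1 if k in h_positions else 0 for k in range(n)]
        e = energy(x, pairs)
        seq = "".join("H" if v else "P" for v in x)
        if best_energy is None or e < best_energy:
            best_energy, best_sequences = e, [seq]
        elif e == best_energy:
            best_sequences.append(seq)
    return best_energy, best_sequences


# Hypothetical target conformation: a 9-residue chain snaking through a 3x3 square lattice.
coords = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 2), (1, 2), (2, 2)]
pairs = contact_pairs(coords)

# Discrete baseline with an assumed composition of three H monomers.
best_energy, best_sequences = enumerate_best(len(coords), 3, pairs)

# The same function evaluated at a continuous point with the same total H content.
uniform_energy = energy([3 / 9] * len(coords), pairs)

print(pairs, best_energy, best_sequences, uniform_energy)
```

Enumeration grows combinatorially with chain length, which is why a continuous relaxation of the same quadratic function is attractive for long chains. In this sketch the relaxed problem would be the minimization of `energy` over 0 ≤ x_i ≤ 1 with the sum of x_i fixed. The discrete sequences are vertices of that feasible set, so the relaxed global minimum cannot exceed the best discrete energy.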
