Robust and spontaneous supramolecular and macromolecular self-assembly
processes are poorly understood. These include helix packing, viral self-assembly,
protein crystallization, prion aggregation, and ligand and drug docking.
To elucidate the structure and geometric properties of molecular assembly
configuration spaces, our new EASAL suite of algorithms and software combines
classical concepts in algebraic topology and recent results in the theory of configuration
spaces. These concepts and results formalize and allow leveraging the relative simplicity
of assembly spaces compared to folding spaces. Specifically, EASAL builds an
atlas by (1) using a geometric-constraint representation to (2) extract a comprehensive
degree-of-freedom-based stratification of the assembly landscape and encode a
topological roadmap of neighborhood and boundary relationships between constant
potential energy regions of varying effective dimensions; and (3) parameterizing regions and
their boundaries by a judicious, region-specific choice of Cayley (distance) coordinates,
typically resulting in convex domains. These parameterizations reliably and efficiently
isolate and enable sampling of crucial low-dimension regions, such as narrow regions
with low potential energy. The underlying theory and a principled algorithm design
provide a formal guarantee of correctness and efficiency.
By sampling the assembly landscape of two transmembrane helices with short-range
pair potentials, this dissertation demonstrates that EASAL provides reasonable coverage
of crucial but narrow regions of low effective dimension with far fewer samples
and computational resources than traditional Monte Carlo- or Molecular Dynamics-based
sampling. Promising avenues for combining the complementary
advantages of the two methods are discussed.
Additionally, since accurate computation of configurational entropy and other
integrals is required for estimation of both free energy and kinetics, it is essential to
obtain uniform sampling in an appropriate Cartesian or moduli-space parameterization.
Standard adjustment of Cayley sampling via the Jacobian of the map between the two
parameterizations is fraught with challenges stemming from an ill-conditioned Jacobian.
This dissertation formalizes and analyzes these challenges to provide modifications to
EASAL that secure the advantages of Cayley sampling while ensuring certain minimum
distance and coverage relationships between sampled configurations in Cartesian
space. The modified EASAL's performance is compared with that of the basic EASAL, and
data are presented for Human and Rat Islet Amylin Polypeptide (HiAPP, PDB-2KJ7,
and RiAPP, PDB-2KB8) dimerization (the two differ in only 6 out of 37 residues, but the
former aggregates into fibrils, while the latter does not).
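As a toy illustration of the Jacobian adjustment described above (not EASAL's actual implementation), consider two rigid bars of lengths a and b joined at a hinge in the plane. The distance d between their free ends serves as a single Cayley parameter, and the hinge angle theta stands in for the Cartesian parameterization. The Python sketch below samples d uniformly and reweights each sample by |d theta / d d| = d / (a b sin theta). The weight grows without bound near the ends of the Cayley interval, which is the kind of ill-conditioning referred to above. The linkage, all function names, and all parameter values are assumptions chosen purely for illustration.

```python
import math


def angle_from_cayley(d, a, b):
    """Hinge angle theta for bars of lengths a, b whose free ends are d apart.

    Uses the law of cosines: d^2 = a^2 + b^2 - 2 a b cos(theta).
    """
    cos_theta = (a * a + b * b - d * d) / (2.0 * a * b)
    # Clamp to guard against floating-point drift just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def jacobian_weight(d, a, b):
    """|d theta / d d| = d / (a b sin(theta)); singular where sin(theta) -> 0."""
    sin_theta = math.sin(angle_from_cayley(d, a, b))
    return d / (a * b * sin_theta)


def cayley_midpoints(a, b, n):
    """n uniformly spaced midpoints in the feasible Cayley interval [|a-b|, a+b]."""
    lo, hi = abs(a - b), a + b
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)], h


def integrate_over_angle(f, a, b, n):
    """Estimate the integral of f(theta) over (0, pi) from uniform Cayley samples.

    Each uniform sample in d is reweighted by the Jacobian of the map d -> theta.
    """
    samples, h = cayley_midpoints(a, b, n)
    return sum(
        f(angle_from_cayley(d, a, b)) * jacobian_weight(d, a, b) * h
        for d in samples
    )


if __name__ == "__main__":
    a, b = 2.0, 1.0  # hypothetical bar lengths
    print("weight mid-interval :", jacobian_weight(2.0, a, b))
    print("weight near boundary:", jacobian_weight(1.00001, a, b))
    print("integral of 1 over theta:", integrate_over_angle(lambda t: 1.0, a, b, 100000))
```

In this toy case the reweighted estimate of the integral of the constant function 1 should approach pi, but only slowly, because the weight is singular at both ends of the interval [|a - b|, a + b]. A smooth integrand that vanishes at those ends, such as sin(theta), is recovered much more accurately.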
Finally, we discuss future algorithmic problems concerning the removal of
exponential dependence on dimension.