ZIB

Visualization and Data Analysis

MolVis

Molecular Visualization

The understanding of biochemical processes requires the study of structure and dynamics of biomolecules of varying size. In these investigations vast amounts of data are generated both in experiments and in numerical simulations. An interplay between chemical classification, mathematical data reduction and graphical data representation is necessary to aid in uncovering the essential structure of biomolecular processes and making them understandable to the human observer. The aim of this project is to develop a software environment which integrates molecular dynamics simulation, visualization, and analysis.

Introduction

The central goal of molecular biology is to elucidate the relationship between sequence, structure, properties and function of biomolecules. Such knowledge allows one to understand the molecular mechanisms underlying biological or pharmaceutical processes. Furthermore, it enables one to identify drug candidates and optimize them. The two major steps in drug discovery are finding a chemical compound that shows a desired bioactivity and then modifying this drug candidate so that it acquires all desirable properties.

The effectiveness of drugs depends on their shape and charge distribution. However, molecules are not static structures. They move, vibrate, and interact with other molecules and their environment. Understanding these movements and interactions is essential for a complete understanding of structure-function relationships, including many aspects of drug design and intermolecular interactions. The shapes of a molecule have to be understood as metastable conformations, i.e. regions of the high-dimensional coordinate space in which the system stays for a long time. In order to understand the bioactivity of a molecule, its different conformations and their probabilities of occurrence have to be considered. Regarding drug design, for instance, it may be that one conformation does not fit into the active site, while another one fits perfectly. Moreover, a biochemical function may be the result of a conformational change.

The computational identification of molecular conformations is a difficult task, but it is possible for small and medium-sized molecules. This analysis yields approximations of metastable conformations in the form of finite sets of corresponding molecular geometries.

Our central aim is to visualize metastable conformations in such a way that one is able to understand the differences between the various shapes a molecule may adopt. Furthermore, we are interested in depicting the fuzziness of these shapes, i.e. the locally differing amounts of flexibility and rigidity. We developed methods

  • for selecting realistic molecular geometries representing metastable conformations,
  • for representing the probability density of molecular shapes by an accumulated probability density of all parts of a molecule,
  • for fast computation of this probability density, applicable also to large sets of molecular geometries,
  • for depicting this probability density in combination with standard as well as newly developed molecular visualization techniques,
  • for multiple semi-flexible superposition of molecules on the basis of conformational representatives of the molecules.
All these techniques have been implemented in the 3D visualization and 3D image analysis system amira.

amiraMol

In order to be flexible in developing new algorithms for the visualization and analysis of static molecules as well as of dynamic molecular data from simulations, we started integrating molecular visualization techniques into the visualization system amira. This marked the birth of amiraMol. Since then, most of the standard visualization techniques for molecules have been added to amiraMol. Furthermore, the molecular surface and interface modules in amiraMol are superior to those found in many other molecular visualization systems. With amiraMol's selection browser, the user can easily combine many visualization techniques into one image. It allows the user to hide less interesting parts and emphasize the ones of greatest interest. amiraMol also offers basic facilities for manipulating molecules. However, the main focus has been put on supporting dynamic data and visualizing metastable molecular conformations. Please see the next section for more information. Below are a few pictures generated with amiraMol.

Example images: ball-and-stick with bond angles, secondary structures, molecular surface, molecular interface.

Metastable Molecular Conformations

The phrase 'metastable conformation' indicates a dynamic aspect of molecular behavior: it denotes metastable shapes, i.e. molecular geometries which survive the fast oscillations around equilibrium positions. In other words, these are configurations whose trajectories remain inside a set for a long time before eventually leaving it. In mathematical terms, a metastable conformation is therefore an almost invariant set of the ensemble.
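The notion of an almost invariant set can be illustrated on a small discrete model. The following is only a sketch under stated assumptions: the dynamics are replaced by a hypothetical four-state Markov chain with transition matrix `P`, and the function names `stationary_distribution` and `metastability` are illustrative, not part of amiraMol. A set counts as almost invariant when the probability of still being inside it after one step, given a start inside it, is close to 1.

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum 1."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()

def metastability(P, subset):
    """Probability of staying in `subset` for one step, starting from the
    stationary distribution restricted to `subset`."""
    pi = stationary_distribution(P)
    subset = np.asarray(subset)
    stay = (pi[subset, None] * P[np.ix_(subset, subset)]).sum()
    return stay / pi[subset].sum()

# Hypothetical chain: states {0, 1} and {2, 3} exchange quickly internally
# but only rarely jump to the other pair.
P = np.array([
    [0.50, 0.49, 0.01, 0.00],
    [0.49, 0.50, 0.00, 0.01],
    [0.01, 0.00, 0.50, 0.49],
    [0.00, 0.01, 0.49, 0.50],
])

print(metastability(P, [0, 1]))  # approximately 0.99: almost invariant
print(metastability(P, [0, 2]))  # approximately 0.51: not metastable
```

In this toy model the set {0, 1} plays the role of a metastable conformation, while {0, 2} mixes states that the dynamics separate almost immediately.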

Representatives of Conformations

A metastable conformation can be understood as a fuzzy molecular shape, i.e. a set of molecular shapes that fills a finite part of the space of molecular geometries. One way of visualizing this set of shapes is to depict a representative geometry, which can be interpreted as a shape around which the other geometries are distributed. Representatives can also be used to compute an alignment between two metastable conformations using the techniques described in the Alignment section.

Mean and representative shape

An approximation of such a representative shape can be computed by first performing an alignment and then averaging every Cartesian atom coordinate over all geometries. This procedure yields mean coordinates for all atoms that generally do not define a realistic shape. Parts of the molecule that are very flexible, especially parts performing large rotations, will be distorted in the mean geometry (cf. red shape in the image on the left).

To obtain a realistic shape, we then search for the geometry with the smallest distance to the mean shape (cf. blue shape in the image on the left). Representatives obtained by this procedure give us a way to compare metastable conformations according to the molecular shapes they describe.
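The two steps can be sketched in a few lines of Python. This is a minimal illustration, not the amira implementation: it assumes the geometries are already aligned, uses the root-mean-square deviation as the distance to the mean shape (the text does not specify the distance measure), and the function name `mean_and_representative` is hypothetical.

```python
import numpy as np

def mean_and_representative(geometries):
    """geometries: array of shape (n_geometries, n_atoms, 3), already aligned.

    Returns the mean shape, the index of the geometry closest to it,
    and the RMSD of every geometry to the mean shape.
    """
    geometries = np.asarray(geometries, dtype=float)
    mean_shape = geometries.mean(axis=0)
    diffs = geometries - mean_shape
    rmsd = np.sqrt((diffs ** 2).sum(axis=2).mean(axis=1))
    return mean_shape, int(np.argmin(rmsd)), rmsd

# Toy example: a two-atom "molecule" with unit bond length whose second atom
# rotates by -60, 0 and +60 degrees around the first atom.
angles = np.radians([-60.0, 0.0, 60.0])
geoms = np.array([[[0.0, 0.0, 0.0], [np.cos(a), np.sin(a), 0.0]] for a in angles])

mean_shape, rep_index, rmsd = mean_and_representative(geoms)
print(np.linalg.norm(mean_shape[1] - mean_shape[0]))  # about 0.667: bond is shortened
print(rep_index)                                      # 1: the 0-degree geometry
```

The shortened bond in the mean shape is the distortion described above, whereas the selected representative is an actual geometry from the set and therefore keeps the correct bond length.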

Conformational Density

Although a conformation representative is an important means for understanding the essential properties of a conformation, it is of limited value because the information about the flexibility of the conformation is lost. What is needed is some kind of probability density of the location of the molecule. One possibility is to accumulate the density of the volume-filling primitives abstracting the molecule, namely cylinders and spheres. We call such a density a conformational density, since we accumulate the density over all molecular geometries of a metastable conformation, i.e. different representations of the same molecule. The problem here is that the geometries in a molecular dynamics trajectory are normally subject to arbitrary translations and rotations. Thus, we need to transform all geometries into a common coordinate system. This can be done using the alignment procedures described in the Alignment section.
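A simplified version of this accumulation can be sketched as follows. The sketch makes several assumptions that go beyond the text: only spheres (atoms) are rasterized and the cylinders (bonds) are omitted, the geometries are assumed to be aligned already, a brute-force voxel test is used instead of a fast algorithm, and the function name `conformational_density` and its parameters are hypothetical. Each voxel ends up holding the fraction of geometries in which it is covered by the molecule.

```python
import numpy as np

def conformational_density(geometries, radii, grid_min, grid_max, n_voxels):
    """Fraction of geometries whose atom spheres cover each voxel center.

    geometries: (n_geometries, n_atoms, 3), already aligned
    radii:      one sphere radius per atom
    """
    geometries = np.asarray(geometries, dtype=float)
    axes = [np.linspace(grid_min[d], grid_max[d], n_voxels) for d in range(3)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    points = np.stack([gx, gy, gz], axis=-1)          # (n, n, n, 3) voxel centers
    density = np.zeros((n_voxels,) * 3)
    for geom in geometries:
        occupied = np.zeros(density.shape, dtype=bool)
        for center, radius in zip(geom, radii):
            dist2 = ((points - center) ** 2).sum(axis=-1)
            occupied |= dist2 <= radius * radius      # union of the atom spheres
        density += occupied
    return density / len(geometries)

# Toy example: atom 0 is rigid, atom 1 occupies two different positions.
geoms = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
])
rho = conformational_density(geoms, [0.5, 0.5], (-2, -2, -2), (2, 2, 2), 21)
print(rho[10, 10, 10])  # voxel at the origin, covered in every geometry
print(rho[15, 10, 10])  # voxel at (1, 0, 0), covered in half of the geometries
```

Rigid parts of the molecule produce values near 1, while flexible parts are smeared out over many voxels with low values, which is the fuzziness the density is meant to convey.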

The computed densities can be visualized by direct volume rendering or isosurfaces. The following example images illustrate two conformations of epigallocatechin gallate.

Left: first conformation (direct volume rendering). Middle: both conformations (isosurfaces). Right: second conformation (direct volume rendering).

Alignment

A crucial difficulty of conformation visualization lies in the conversion between relative and absolute coordinates. A single molecular geometry is specified by a three-component Cartesian coordinate vector for each atom. These geometries can be used for a direct visualization of the molecule, e.g. by balls and sticks.

On the other hand, all geometric comparisons and distance measurements between molecular geometries, as they are performed during conformation analysis, rely on relative coordinates, i.e. bond lengths, bond angles, and torsion angles. In other words, conformations are independent of any rigid transformations that were previously applied to single geometries.

However, the visualization of conformations takes place in Cartesian coordinates. Therefore, it is necessary to assign global positions and orientations to the geometries and thereby to define a relative alignment between them.

We implemented two kinds of alignment based on minimization of mean squared pairwise atom distances. The first method, Ordinary Partial Procrustes Analysis, requires the choice of one reference configuration. The alignment then minimizes for every other configuration its distance to this reference. The second method, Generalized Partial Procrustes Analysis (GPPA), minimizes the sum of distances of all pairs of configurations. Therefore, it is independent of the choice of a reference.
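Both variants can be sketched with NumPy. This is an illustration under stated assumptions, not the code used in amira: the reference-based alignment is written as the standard SVD solution for the optimal rotation (often called the Kabsch algorithm), and the generalized variant is written as the usual iteration that repeatedly aligns all geometries to their running mean, which decreases the same sum of pairwise squared distances. The function names are hypothetical.

```python
import numpy as np

def align_to_reference(mobile, reference):
    """Rigidly align `mobile` (n_atoms, 3) onto `reference` (n_atoms, 3).

    Rotation and translation only, no scaling (partial Procrustes).
    """
    mob_center = mobile.mean(axis=0)
    ref_center = reference.mean(axis=0)
    H = (mobile - mob_center).T @ (reference - ref_center)
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so that R is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return (mobile - mob_center) @ R.T + ref_center

def generalized_procrustes(geometries, n_iter=50, tol=1e-10):
    """Align all geometries to their common mean until the mean stops changing."""
    aligned = np.array(geometries, dtype=float)
    aligned -= aligned.mean(axis=1, keepdims=True)    # remove translations
    mean = aligned[0].copy()                          # initial guess only
    for _ in range(n_iter):
        aligned = np.array([align_to_reference(g, mean) for g in aligned])
        new_mean = aligned.mean(axis=0)
        converged = np.linalg.norm(new_mean - mean) < tol
        mean = new_mean
        if converged:
            break
    return aligned, mean
```

The first geometry only serves as a starting guess in `generalized_procrustes`; after convergence the result depends on the mean of all geometries rather than on a single chosen reference.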

For a group of geometries that can be interpreted as variations of a single basic form, GPPA is feasible. However, as the geometries from different metastable conformations have to be interpreted as variations of multiple substantially different forms, we need a more sophisticated alignment method for the visualization of metastable conformations.

Therefore, we introduced a method for the identification of similarities and differences between sub-geometries from different metastable sets. From this information we can derive a classification of all geometries that states which geometries can be treated as similar with respect to which atoms. The classification depends on the choice of an atom group that has approximately the same form in all geometries. Based on this choice, it is possible to define a visualization focus.

Further, we introduced a new objective function for the alignment of molecular geometries which incorporates the above-mentioned classification. Thus, it considers similarities as well as substantial differences between the conformations. To find an alignment that minimizes the new objective function, we derived an algorithm that is formally analogous to the Generalized Partial Procrustes Analysis. Using this new alignment method, we can now visualize and compare multiple metastable conformations in one common density depiction.

The following images show a sampling consisting of 241760 forms of the molecule BSI, approximating 192 metastable sets. The forms were aligned using the metastability-aware alignment method described above. The left image shows the superposition of the 192 average forms of BSI's metastable sets. Based on this alignment, Cartesian atom coordinates were averaged per metastable set. The 192 metastable sets have 12 chemically distinct forms. The right image shows the corresponding densities in different colors.

Left: average molecules of the 192 metastable sets of BSI. Right: Cartesian form densities (12 chemically distinct densities).

Skeleton-based Alignment of Drug-like Molecules

Algorithms for the comparison of molecules are important tools in Virtual Screening. Consider the case where one has several active drugs, but no receptor structure. This is often the case, since the crystallization of proteins is still a difficult task. In this case, one might be interested in the properties that render a drug active. Since the 3-dimensional structure plays a major role in the binding process, it is important to base this comparison on the structural properties of the drug molecules. Since molecules can adopt different shapes, we need to take the flexibility of the molecules into account. We do this by applying a semi-flexible strategy, i.e., we consider more than one conformer per molecule. Also, in the described case we are not interested in the best superposition of pairs of molecules, but in good superpositions of more than two molecules, possibly all molecules of the considered set. Therefore, we developed a multiple semi-flexible algorithm for the superposition of drug-like molecules. A more detailed description of the algorithm can be found in the following subsections. The whole algorithm is described here.

Pairwise Superpositions

The pairwise comparison of two molecular structures is done by first computing a number of start transformations, which are locally optimized using an iterative point matching algorithm. We use a scoring function that represents a trade-off between the size of a matching, i.e., the number of matched atoms, and the goodness of fit. Additionally, we weight the goodness of fit according to the type of the matched atoms.
Two matchings of the same molecular conformers of two angiotensin-II-antagonists. In the left image, the upper double ring is matched, whereas in the right image, the functional group in the lower left half is matched instead.
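The local optimization of one start transformation can be sketched as an alternation between matching and rigid fitting. Everything specific in this sketch is an assumption made for illustration: the greedy one-to-one matching within a distance `cutoff`, the score `n_matched / (1 + rmsd)` as one possible trade-off between matching size and goodness of fit, and all function names. The atom-type weighting mentioned above is omitted.

```python
import numpy as np

def fit_rigid(P, Q):
    """Least-squares rotation R and translation t with R @ p + t close to q."""
    pc, qc = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - pc).T @ (Q - qc))
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, qc - R @ pc

def greedy_matching(A, B, cutoff):
    """One-to-one atom pairs (i, j) with |A[i] - B[j]| <= cutoff, closest first."""
    dist = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    pairs, used_a, used_b = [], set(), set()
    for flat in np.argsort(dist, axis=None):
        i, j = (int(k) for k in np.unravel_index(flat, dist.shape))
        if dist[i, j] > cutoff:
            break
        if i not in used_a and j not in used_b:
            pairs.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return sorted(pairs)

def score(A, B, pairs):
    """Hypothetical trade-off: more matched atoms and a smaller RMSD score higher."""
    if not pairs:
        return 0.0
    i, j = map(list, zip(*pairs))
    rmsd = np.sqrt(((A[i] - B[j]) ** 2).sum(axis=1).mean())
    return len(pairs) / (1.0 + rmsd)

def refine(A, B, R, t, cutoff=1.0, n_iter=20):
    """Iteratively improve the start transformation (R, t) that maps A onto B."""
    pairs = []
    for _ in range(n_iter):
        new_pairs = greedy_matching(A @ R.T + t, B, cutoff)
        if len(new_pairs) < 3 or new_pairs == pairs:
            break                                  # too few pairs or converged
        pairs = new_pairs
        i, j = map(list, zip(*pairs))
        R, t = fit_rigid(A[i], B[j])
    moved = A @ R.T + t
    pairs = greedy_matching(moved, B, cutoff)
    return R, t, pairs, score(moved, B, pairs)
```

In a full pairwise comparison, `refine` would be called once per start transformation and the best-scoring results kept; different start transformations can converge to different matchings, as in the two images above.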

Multiple Semi-flexible Superposition

The multiple superposition requires specifying a reference molecule. All other molecules, called query molecules, are compared to this reference molecule. This is done by computing all pairwise superpositions of all conformers of the query molecules with each reference conformer separately. The pairwise matchings of all the conformers of a single query molecule with a single reference conformer are stored in a data structure called a matching tree. This data structure allows the matching trees of different molecules corresponding to the same reference conformer to be merged efficiently, resulting in multiple matchings.
Two multiple superpositions of four angiotensin-II-antagonists. Here, different conformers of the molecules were superposed. The numbers 1 and 2 denote two carbon-rings.

Surface-based Alignment of Drug-like Molecules

The skeleton-based approach described above has the disadvantage that it only works for molecules with similar skeletons. However, what an enzyme sees of a ligand is not the molecular skeleton but some kind of outer shape, which can be approximated by the molecular surface.

We have developed a new approach to molecular surface alignment based on point matching. In order to employ point matching, we approximate the molecular surface by points which we distribute regularly on the molecular surface. This allows us to easily incorporate several molecular properties by adding more points to the representation.

Surface Representation Using Points

The point-based representation of the molecular surface is generated iteratively. A start representation is calculated using an efficient mesh partitioning scheme which partitions the molecular surface into connected patches. Into each such patch we place a single point (see left image below). This initial point representation is then relaxed using centroidal Voronoi tessellation, which we extended to triangular meshes (see right image below). For more information, please see the original publication.

Left: Initial point positions due to mesh partitioning. Right: Point positions after 50 point relaxation steps. Here, the patches denote the discrete Voronoi tessellation w.r.t. the points. Note the much more regular point distribution.
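The relaxation step can be illustrated with a discrete, Lloyd-style iteration. The sketch below is a strong simplification and not the published method: the surface is represented only by its vertices, patches are formed with Euclidean rather than on-mesh distances, each point is snapped to the vertex nearest to the centroid of its patch, and a unit sphere sampled with a Fibonacci spiral stands in for a molecular surface. All function names are hypothetical.

```python
import numpy as np

def fibonacci_sphere(n):
    """Roughly uniform sample of n vertices on the unit sphere (stand-in surface)."""
    i = np.arange(n)
    z = 1.0 - 2.0 * (i + 0.5) / n
    r = np.sqrt(1.0 - z * z)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def assign(vertices, points):
    """Discrete Voronoi patches: owning point and squared distance per vertex."""
    d2 = ((vertices[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1)

def energy(vertices, idx):
    """Sum of squared distances from every vertex to its nearest point."""
    return assign(vertices, vertices[idx])[1].sum()

def relax_points(vertices, start_idx, n_steps=50):
    """Move each point to the vertex nearest to the centroid of its patch."""
    idx = np.array(start_idx)
    for _ in range(n_steps):
        owner, _ = assign(vertices, vertices[idx])
        new_idx = idx.copy()
        for k in range(len(idx)):
            patch = vertices[owner == k]
            if len(patch) == 0:
                continue                      # empty patch: leave the point as is
            centroid = patch.mean(axis=0)
            new_idx[k] = ((vertices - centroid) ** 2).sum(axis=1).argmin()
        if np.array_equal(new_idx, idx):
            break                             # fixed point reached
        idx = new_idx
    return idx

vertices = fibonacci_sphere(400)
start = np.arange(10)                         # all start points clustered at one pole
relaxed = relax_points(vertices, start)
print(energy(vertices, start), energy(vertices, relaxed))
```

Because every point starts on a vertex and is only moved to the vertex closest to its patch centroid, the printed energy cannot increase from one step to the next, and a lower value corresponds to a more regular distribution of points over the surface.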

Pairwise Surface Alignment

Once points have been distributed on the surface for all molecular properties of interest, such as electrostatic potential, donor/acceptor regions, and shape, we can apply point matching to these points. In order to align molecular surfaces, we first generate a number of initial alignments, each of which is locally optimized using an iterative point matching scheme. We have developed an efficient algorithm and data structure to speed up the computation of point matchings for these possibly very large numbers of points. For details, please see the original publication.

Point-based surface alignment of two HIV-1 protease inhibitors, tipranavir and amprenavir. The surfaces represent isosurfaces of conformational densities of two metastable molecular conformations. The color denotes the averaged electrostatic potential. The points represent those parts of the surfaces (and properties) that have been matched. Left: view from the side. Right: view from above.

Publications

Organizational Details

Duration

01/1998 -

© Zuse Institute Berlin 2010