Displaying Biochemical Properties
- 1 Selecting secondary structures
- 2 Coloring
- 3 Bonds
- 4 Calculating dihedral angles
- 5 Cavities
- 6 Surface-Related
- 7 Backbones
- 8 Align proteins with CA fit
Selecting secondary structures
A few examples:
select helix, (ss h)
select sheet, (ss s)
select loop, (ss l+'')
Manually Assigning Secondary Structure
You can manually assign secondary structures to your protein with
alter 96-103/, ss='S'
alter 96-103/, ss='H'
alter 96-103/, ss='L'
to set residues 96-103 to beta strand, alpha helix, and loop, respectively.
Coloring
See also Category:Coloring.
Color by atom type from a script
See Color for this.
Assign color by B-factor
See section Color for this.
Representation-independent color control
See section Surface#Representation-independent Color Control.
Bonds
PyMOL can deduce bonds from the PDB structure file, even if the CONECT records are missing. PyMOL guesses bonding connectivity from proximity, relying on the empirical observation that two atoms of a given radius will generally not be closer than a certain distance unless they are bonded.
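The proximity idea can be sketched in plain Python. This is only an illustration of the general technique, not PyMOL's actual implementation: the radii table, the `tolerance` value, and the function name `guess_bonds` are all assumptions made for the example.

```python
import math

# Approximate covalent radii in angstroms (illustrative values, an assumption).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}


def guess_bonds(atoms, tolerance=0.45):
    """Return index pairs (i, j) whose distance is at most r_i + r_j + tolerance.

    atoms is a list of (element, (x, y, z)) tuples. The tolerance is a
    hypothetical slack value, not a PyMOL setting.
    """
    bonds = []
    for i in range(len(atoms)):
        elem_i, pos_i = atoms[i]
        for j in range(i + 1, len(atoms)):
            elem_j, pos_j = atoms[j]
            cutoff = COVALENT_RADII[elem_i] + COVALENT_RADII[elem_j] + tolerance
            if math.dist(pos_i, pos_j) <= cutoff:
                bonds.append((i, j))
    return bonds


# A water molecule: the oxygen is close enough to both hydrogens to be
# considered bonded, while the two hydrogens are too far apart.
water = [
    ("O", (0.000, 0.000, 0.000)),
    ("H", (0.957, 0.000, 0.000)),
    ("H", (-0.240, 0.927, 0.000)),
]
print(guess_bonds(water))
```

A distance test like this cannot tell single from double bonds, which is one reason valences have to be guessed or supplied separately.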
Displaying double bonds
Switch to the lines representation and turn on the valence display:
hide
as lines
set valence, 0.1
A higher value for valence spreads things out more. I don't know of a way to get the dotted notation.
Hydrogen bonds and Polar Contacts
Using the actions [A] button for an object or selection you can display Hydrogen bonds and Polar Contacts. [A]->find->polar contacts-><select from menu>
The command behind the menus is the distance command called with the additional argument mode=2.
Parameters that control the identification of H-bonds are defined as
set h_bond_cutoff_center, 3.6
with ideal geometry and
set h_bond_cutoff_edge, 3.2
with minimally acceptable geometry.
These settings can be changed *before* running the detection process (dist command mode=2 or via the menus).
Note that the hydrogen bond geometric criteria used in PyMOL were designed to emulate those used by DSSP.
Hydrogen bonds between specific atoms
dist name, sele1, sele2, mode=2
Hydrogen bonds where find->polar contacts doesn't do what you need
You can show H-bonds between two objects using atom selections so long as hydrogens are present in both molecules. If you don't have hydrogens, you can use h_add on the proteins, or provide ligands with valence information and then use h_add.
Two examples are below. For clarity, they draw dashes between the heavy atoms and hide the hydrogens.
# EXAMPLE 1: Show hydrogen bonds between protein
# and docked ligands (which must have hydrogens)
load target.pdb,prot
load docked_ligs.sdf,lig
# add hydrogens to protein
h_add prot
select don, (elem n,o and (neighbor hydro))
select acc, (elem o or (elem n and not (neighbor hydro)))
dist HBA, (lig and acc),(prot and don), 3.2
dist HBD, (lig and don),(prot and acc), 3.2
delete don
delete acc
hide (hydro)
hide labels,HBA
hide labels,HBD

# EXAMPLE 2
# Show hydrogen bonds between two proteins
load prot1.pdb
load prot2.pdb
h_add prot1
h_add prot2
select don, (elem n,o and (neighbor hydro))
select acc, (elem o or (elem n and not (neighbor hydro)))
dist HBA, (prot1 and acc),(prot2 and don), 3.2
dist HBD, (prot1 and don),(prot2 and acc), 3.2
delete don
delete acc
hide (hydro)
hide labels,HBA
hide labels,HBD
# NOTE: that you could also use this approach between two
# non-overlapping selections within a single object.
The "polar contacts" mentioned above are probably better at finding hydrogen bonds than these scripts. "Polar contacts" check geometry as well as distance.
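The distance-only test that the two scripts rely on can be sketched in plain Python, which makes it easier to see why it is weaker than a check that also looks at geometry. The atom names and coordinates below are made up for illustration, and `contacts_within` is a hypothetical helper, not a PyMOL function.

```python
import math


def contacts_within(donors, acceptors, cutoff=3.2):
    """Return (donor, acceptor, distance) for every pair within cutoff angstroms.

    donors and acceptors are lists of (name, (x, y, z)) tuples holding
    heavy-atom coordinates. No angle is checked, only the distance.
    """
    pairs = []
    for d_name, d_pos in donors:
        for a_name, a_pos in acceptors:
            dist = math.dist(d_pos, a_pos)
            if dist <= cutoff:
                pairs.append((d_name, a_name, round(dist, 2)))
    return pairs


# Hypothetical heavy-atom coordinates in angstroms.
ligand_donors = [("LIG/N1", (0.0, 0.0, 0.0))]
protein_acceptors = [
    ("ASP25/OD1", (2.9, 0.0, 0.0)),
    ("GLY27/O", (4.5, 0.0, 0.0)),
]
print(contacts_within(ligand_donors, protein_acceptors))
```

Any donor and acceptor that happen to sit within the cutoff are reported, regardless of where the hydrogen points.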
Generally speaking, PyMOL does not have sufficient information to rigorously determine hydrogen bonds, since typical PDB files are ambiguous with respect to charge states, bonds, bond valences, and tautomers. As it stands, all of those things are guessed heuristically. Rigorously determining the location of lone pair electrons and proton coordinates from raw PDB files is a nontrivial problem, especially when arbitrary small molecule structures are present. In addition, PyMOL would also need to consider the implied coordinate error due to overall structure resolution and local temperature factors before rigorously asserting that any specific hydrogen bond does or does not exist.
Furthermore, our hydrogen bond detection machinery was originally developed for purposes of mimicking Kabsch and Sander's DSSP secondary structure assignment algorithm (Biopolymers 22, 2577, 1983) which is based on a rather generous notion of hydrogen bonding (see Kabsch Figure 1).
Although this approximate capability can be accessed via the distance command using mode=2, the criteria applied by our implementation may be based on heavy-atom coordinates (only) and do not necessarily correspond to anything rigorous or published. So the bottom line is that PyMOL merely offers up putative polar contacts and leaves it to the user to determine whether or not the interactions present are in fact hydrogen bonds, salt bridges, polar interactions, or merely artifacts of incorrect assignments (e.g. two carbonyls hydrogen bonding because they're being treated like hydroxyls).
With respect to the h_bond_* settings, the angle in question for h_bond_cutoff_* and h_bond_max_angle is ADH, assuming H exists. If H does not exist, then PyMOL will guess a hypothetical coordinate which may not actually be valid (in plane, etc.). The hydrogen must also lie within a cone of space with its origin on A (along B->A) and angular width h_bond_cone. Since h_bond_cone is 180 by default, the present behavior is to simply reject any hydrogen bond where the hydrogen would lie behind the plane defined by the acceptor atom (A) in relation to its bonded atom(s) B (if any). In other words, if B is only one atom (e.g. C=O vs. C-O-C), then by default, HAB cannot be less than 90 degrees.
The two h_bond_power_* settings are merely fitting parameters which enable PyMOL to reproduce a curve shape reflecting Kabsch Figure 1. The endpoints of the effective cutoff curve are a function of the two h_bond_cutoff_* settings.
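The cone test described above can be read as a simple angle check. The sketch below is one interpretation of that description for the single-B case, not a transcription of PyMOL's source: the function name `within_cone` and the coordinates are hypothetical, and the `cone` argument only stands in for the value of h_bond_cone.

```python
import math


def within_cone(h, a, b, cone=180.0):
    """Return True if H lies inside a cone with apex A and axis B->A.

    h, a, b are (x, y, z) tuples for the hydrogen, the acceptor, and the
    single atom bonded to the acceptor. cone is the full angular width in
    degrees, so H is accepted when the angle between A->H and the axis
    is at most cone / 2.
    """
    axis = tuple(ai - bi for ai, bi in zip(a, b))   # direction B->A
    to_h = tuple(hi - ai for hi, ai in zip(h, a))   # direction A->H
    dot = sum(x * y for x, y in zip(axis, to_h))
    norm = math.hypot(*axis) * math.hypot(*to_h)
    # Clamp to guard against rounding slightly outside [-1, 1].
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    return angle <= cone / 2.0


acceptor = (0.0, 0.0, 0.0)
bonded = (-1.0, 0.0, 0.0)    # B sits on the -x side, so the axis B->A points along +x

print(within_cone((1.0, 1.0, 0.0), acceptor, bonded))    # H in front of the plane
print(within_cone((-1.0, 1.0, 0.0), acceptor, bonded))   # H behind the plane
```

With a cone width of 180 degrees, the angle to the axis is 180 minus HAB, so accepting angles up to 90 degrees is the same as requiring HAB to be at least 90 degrees.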
Calculating dihedral angles
The get_dihedral function requires four single-atom selections to work:
get_dihedral prot1///9/C, prot1///10/N, prot1///10/CA, prot1///10/C
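For reference, the angle defined by four atoms can be computed directly from their coordinates. The following is a minimal sketch of the standard textbook formula in plain Python; the function name `dihedral` is an assumption for this example, and the sign convention is that of the formula shown, which has not been checked here against what get_dihedral reports.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees for four points given as (x, y, z)."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)          # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)          # normal of the plane through p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Four points with the central bond along the y axis.
p1, p2, p3 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(dihedral(p1, p2, p3, (1.0, 1.0, 0.0)))    # cis arrangement
print(dihedral(p1, p2, p3, (-1.0, 1.0, 0.0)))   # trans arrangement
print(dihedral(p1, p2, p3, (0.0, 1.0, 1.0)))    # perpendicular arrangement
```

Using atan2 rather than acos keeps the sign of the angle and avoids numerical trouble near 0 and 180 degrees.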
Surface-Related
To calculate the surface area of a selection, see Get_Area.
Polar surface area
For a solvent accessible PSA approximation:
set dot_density, 3
remove hydro
remove solvent
show dots
set dot_solvent, on
get_area elem N+O
get_area elem C+S
get_area all
For molecular PSA approximation:
set dot_density, 3
remove hydro
remove solvent
set dot_solvent, off
get_area elem N+O
get_area elem C+S
get_area all
Showing dots isn't mandatory, but it's a good idea to confirm that you're getting the value for the atom dot surface you think you're using. Please realize that the resulting numbers are only approximate, reflecting the sum of partial surface areas for all the dots you see. To increase accuracy, set dot_density to 4, but be prepared to wait...
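The idea of summing partial areas over dots can be illustrated with a small Shrake-Rupley style sketch in plain Python. It is not PyMOL's implementation: the golden-spiral dot layout, the function names, and the `probe` and `n_dots` parameters are assumptions for the example. A probe of 0 loosely corresponds to the van der Waals dot surface, and a probe of about 1.4 to a solvent accessible estimate.

```python
import math


def sphere_points(n):
    """Return n roughly evenly spread points on a unit sphere (golden spiral)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden * k), r * math.sin(golden * k), z))
    return points


def dot_area(atoms, probe=0.0, n_dots=500):
    """Approximate exposed area per atom by counting unburied dots.

    atoms is a list of ((x, y, z), vdw_radius) tuples. Each dot stands for
    an equal share of its atom's inflated sphere; a dot is buried when it
    falls inside the inflated sphere of any other atom.
    """
    unit = sphere_points(n_dots)
    areas = []
    for i, (ci, ri) in enumerate(atoms):
        big_r = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            p = (ci[0] + big_r * ux, ci[1] + big_r * uy, ci[2] + big_r * uz)
            buried = any(
                math.dist(p, cj) < rj + probe
                for j, (cj, rj) in enumerate(atoms)
                if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_dots)
    return areas


single = [((0.0, 0.0, 0.0), 1.7)]
pair = [((0.0, 0.0, 0.0), 1.7), ((1.5, 0.0, 0.0), 1.7)]
print(dot_area(single))              # isolated atom, van der Waals surface
print(dot_area(single, probe=1.4))   # same atom with a 1.4 angstrom probe
print(dot_area(pair))                # two overlapping atoms bury part of each other
```

Raising `n_dots` plays the same role as raising dot_density: the estimate gets finer at the cost of more distance checks.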
Display solvent accessible surface
Using the surface display mode, PyMOL doesn't show the solvent accessible surface, rather it shows the solvent/protein contact surface. The solvent accessible surface area is usually defined as the surface traced out by the center of a water sphere, having a radius of about 1.4 angstroms, rolled over the protein atoms. The contact surface is the surface traced out by the vdw surfaces of the water atoms when in contact with the protein.
PyMOL can show solvent accessible surfaces using the dot or sphere representations:
show dots
set dot_mode,1
set dot_density,3

alter all,vdw=vdw+1.4
show spheres
Once the Van der Waals radii for the selection have been altered, the surface representation will also be "probe-inflated" to show a pseudo solvent accessible surface, as detailed above.
alter all,vdw=vdw+1.4
show surface
Note that to display both the molecular surface and the solvent-accessible surface, the object must be duplicated, as is done for Surface#Representation-independent Color Control. This also applies if the spheres representation is to be used to display "real" atoms.
Electric Field Lines
To produce an image with electric field lines, first run APBS. Then, input the following:
gradient my_grad, pymol-generated
ramp_new my_grad_ramp, pymol-generated
color my_grad_ramp, my_grad
Residues with functional groups
Poor man's solution: Display protein as surface, colorize all Lys (-NH2), Asp and Glu (-COOH) and Cys (-SH):
remove resn hoh    # remove water
h_add              # add hydrogens
as surface
color grey90
color slate, resn lys       # lysines in light blue
color paleyellow, resn cys  # cysteines in light yellow
color tv_red, (resn asp or (resn glu))  # aspartic and glutamic acid in light red
Not-so-poor-man's solution: In order to have the functional groups better localized, only the central atoms can be colored:
- the S atom of cysteine,
- the N and H atoms of the free amine of lysine (may be displayed with three H atoms at all three possible positions)
- the C and two O atoms of free carboxylic groups in aspartic and glutamic acid
In this way, they are more visible through the surface than with only one colored atom; amines and carboxylic groups each consist of three colored atoms.
remove resn hoh    # remove water
h_add              # add hydrogens
as surface
color grey90
select sulf_cys, (resn cys and (elem S))    # get the sulfur atom of cysteine residues
color yellow, sulf_cys
select nitro_lys, (resn lys and name NZ)    # get the nitrogens of free amines ("NZ" in PDB file)
select hydro_lys, (elem H and (neighbor nitro_lys))    # get the neighboring H atoms
select amine_lys, (nitro_lys or hydro_lys)
color tv_blue, amine_lys
select oxy_asp, (resn asp and (name OD1 or name OD2))    # get the two oxygens of -COOH ("OD1", "OD2")
select carb_asp, (resn asp and (elem C and (neighbor oxy_asp)))    # get the connecting C atom
select oxy_glu, (resn glu and (name OE1 or name OE2))    # oxygens "OE1" and "OE2" in PDB file
select carb_glu, (resn glu and (elem c and (neighbor oxy_glu)))
select carboxy, (carb_asp or oxy_asp or carb_glu or oxy_glu)
color tv_red, carboxy
When the protein is displayed as a non-transparent surface, only the functional groups (colored atoms) at the surface are visible. These groups can be emphasized by displaying the corresponding atoms as spheres, e.g. "as spheres, carboxy + amine_lys + sulf_cys", which may make it clearer how accessible they are.
When the protein is displayed as a cartoon, the functional groups can be shown as spheres and the whole cys, lys, asp and glu residues as sticks connected to the backbone, with the atoms of the functional groups again as spheres. However, inaccessible residues inside the protein are then visible as well.
Backbones
Displaying the C-Alpha trace of proteins
hide
show ribbon
set ribbon_sampling,1
And if your model only contains CA atoms, you'll also need to issue:
set ribbon_trace_atoms, 1
Displaying the Amino Acid Backbone
The easiest way to see the backbone of the protein is to do
hide all
show ribbon
If you don't like the ribbon representation, you can also do something like
hide all
show sticks, name C+O+N+CA
You can replace sticks in the above by other representations like spheres or lines.
Displaying the Phosphate backbone of nucleic acids
Native Nucleic Acid Rendering in PyMOL
PyMOL now better supports viewing nucleic acid structure. Nuccyl still seems to be the reigning champ for image quality, but see PyMOL's native Cartoon command. For more information on representing nucleic acids, please see the Nucleic Acids Category.
Should you ever want to show the phosphate trace of a nucleic acid molecule:
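from pymol import cmd

def p_trace(selection="(all)"):
    # Show only a simplified cartoon tube for the given selection.
    s = str(selection)
    cmd.hide('lines', "(" + s + ")")
    cmd.hide('spheres', "(" + s + ")")
    cmd.hide('sticks', "(" + s + ")")
    cmd.hide('ribbon', "(" + s + ")")
    cmd.show('cartoon', "(" + s + ")")
    cmd.set('cartoon_sampling', 1, "(" + s + ")")
    cmd.set('cartoon_tube_radius', 0.5, "(" + s + ")")

# Register p_trace as a PyMOL command.
cmd.extend('p_trace', p_trace)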
Align proteins with CA fit
If two proteins have significant homology, you can use the Align command:
align prot1////CA, prot2
which will perform a sequence alignment of prot1 against prot2, and then an optimizing fit using the CA positions. I'm not sure if the help text for align got into 0.82, but the next version will definitely have it.