Align: Difference between revisions

From PyMOLWiki
 
(7 intermediate revisions by 3 users not shown)
Line 1: Line 1:
'''align''' performs a sequence alignment followed by a structural alignment, and then carries out zero or more cycles of refinement in order to reject structural outliers found during the fit.  For comparing proteins with lower sequence identity, an alignment program like [[Cealign]] might be a better choice.
[[Image:after_alignment.png|400px|thumb|right|Two proteins after structure alignment]]
<gallery>
Image:before_alignment.png|Two unaligned proteins
Image:after_alignment.png|Two proteins after structure alignment
</gallery>


==== Algorithm Details ====
'''align''' performs a sequence alignment followed by a structural superposition, and then carries out zero or more cycles of refinement in order to reject structural outliers found during the fit.
'''align''' does a BLAST-like BLOSUM62-weighted dynamic programming sequence alignment followed by a series of refinement cycles intended to improve the fit by eliminating pairing with high relative variability (e.g. >2 standard deviations from the cycle's mean deviance).
[[align]] does a good job on proteins with decent sequence similarity (identity >30%).
For comparing proteins with lower sequence identity, the [[super]] and [[cealign]] commands perform better.


You can write the final alignment to a file; see [[save]].
== Usage ==


=== Super! ===
align mobile, target [, cutoff [, cycles
PyMOL now has another command -- '''[[super]]'''.  Super allows for much more robust alignments.  It's fast, and under testing, does MUCH better than the original '''align''' command.
    [, gap [, extend [, max_gap [, object
    [, matrix [, mobile_state [, target_state
    [, quiet [, max_skip [, transform [, reset ]]]]]]]]]]]]]


===USAGE===
== Arguments ==
<source lang="python">
 
align (source), (target) [,cutoff [,cycles [,gap [,extend \
* '''mobile''' = string: atom selection of mobile object
      [,skip [,object [,matrix [, quiet ]]]]]]]]
* '''target''' = string: atom selection of target object
</source>
* '''cutoff''' = float: outlier rejection cutoff in RMS {default: 2.0}
* '''cycles''' = int: maximum number of outlier rejection cycles {default: 5}
* '''gap, extend, max_gap''': sequence alignment parameters
* '''object''' = string: name of alignment object to create {default: (no alignment object)}
* '''matrix''' = string: file name of substitution matrix for sequence alignment {default: BLOSUM62}
* '''mobile_state''' = int: object state of mobile selection {default: 0 = all states}
* '''target_state''' = int: object state of target selection {default: 0 = all states}
* '''quiet''' = 0/1: suppress output {default: 0 in command mode, 1 in API}
* '''max_skip''' = ?
* '''transform''' = 0/1: do superposition {default: 1}
* '''reset''' = ?
 
== Alignment Objects ==
 
An alignment object can be created with the '''object='''''somename'' argument. An alignment object provides:
 
* aligned [[seq_view|sequence viewer]]
* graphical representation of aligned atom pairs as lines in the 3D viewer
* can be [[save|saved]] to a clustalw sequence alignment file
 
== RMSD ==
 
The RMSD of the aligned atoms (after outlier rejection!) is reported in the text output. The '''all-atom RMSD''' can be obtained by setting '''cycles=0''' and thus not doing any outlier rejection. The RMSD can also be captured with a Python script; see the [[#PyMOL API|API paragraph]] below. Note that the output prints "RMS", but this is in fact the RMSD, in Angstroms.
 
== Examples ==
 
<syntaxhighlight lang="python">
fetch 1oky 1t46, async=0
 
# 1) default with outlier rejection
align 1oky, 1t46
 
# 2) with alignment object, save to clustalw file
align 1oky, 1t46, object=alnobj
save alignment.aln, alnobj
 
# 3) all-atom RMSD (no outlier rejection) and without superposition
align 1oky, 1t46, cycles=0, transform=0
</syntaxhighlight>


===PYMOL API===
== PyMOL API ==
<source lang="python">
<source lang="python">
cmd.align( string mobile, string target, float cutoff=2.0,
Line 38: Line 75:
# Number of residues aligned


===EXAMPLES===
== Notes ==
<source lang="python">
align  prot1////CA, prot2, object=alignment
</source>
 
===NOTE===
* If object is not None, then align will create an object which indicates which atoms were paired between the two structures


* <b>Important note:</b> the molecules you want to align need to be in two different objects. Otherwise, PyMOL will answer with a rather cryptic error:
* The molecules you want to align need to be in '''two different objects'''. Otherwise, PyMOL will answer with: ''ExecutiveAlign: invalid selections for alignment.'' You can work around this by making a temporary object and aligning your original to the copy.
* By default, '''all states''' (as in NMR structures or trajectories) are considered, which might yield a bad or suboptimal alignment for a single state. Use the '''mobile_state''' and '''target_state''' arguments to be explicit in such cases.


<source lang="python">
== See Also ==
ExecutiveAlign: invalid selections for alignment.
</source>
You can work around this by making a temporary object and aligning your original to the copy.
 
* Sometimes align may appear to give a mediocre fit. This is not due to a shortcoming of the algorithm or of PyMOL. It usually happens when one or more of the objects you are trying to align have multiple states. For instance, certain PDB files contain multiple structures or ensembles of the same protein, which is especially common for NMR structures. The workaround in such a situation is to use this workflow (provided by Warren - thanks!):
<source lang="python">
set all_states, on
intra_fit <your_structure_1>
intra_fit <your_structure_2>
align <your_structure_1>////CA, <your_structure_2>////CA
</source>


===SEE ALSO===
* [[super]], [[cealign]], [[fit]], [[pair_fit]]
[[Cmd fit|fit]], [[Cmd rms|rms]], [[Cmd rms_cur|rms_cur]], [[Cmd intra_rms|intra_rms]], [[Cmd intra_rms_cur|intra_rms_cur]], [[Cmd pair_fit|pair_fit]], [[Cmd intra_fit|intra_fit]], [[Kabsch]], [[Cealign]], [[Color_by_conservation]], [[tmalign]], [[Extra_fit]], [http://pldserver1.biochem.queensu.ca/~rlc/work/pymol/ align_all.py and super_all.py].
* [[rms]], [[rms_cur]]
* [[intra_fit]], [[intra_rms]], [[intra_rms_cur]]
* [[extra_fit]]
* [http://pldserver1.biochem.queensu.ca/~rlc/work/pymol/ align_all.py and super_all.py]
* [[tmalign]]
* [[Color_by_conservation]]
* [[Get_raw_alignment]]
* [[mcsalign]] (psico)


[[Category:Commands|Align]]
[[Category:Structure_Alignment|Align]]

Latest revision as of 11:13, 6 December 2017

Two proteins after structure alignment

align performs a sequence alignment followed by a structural superposition, and then carries out zero or more cycles of refinement in order to reject structural outliers found during the fit. align does a good job on proteins with decent sequence similarity (identity >30%). For comparing proteins with lower sequence identity, the super and cealign commands perform better.

Usage

align mobile, target [, cutoff [, cycles
    [, gap [, extend [, max_gap [, object
    [, matrix [, mobile_state [, target_state
    [, quiet [, max_skip [, transform [, reset ]]]]]]]]]]]]]

Arguments

  • mobile = string: atom selection of mobile object
  • target = string: atom selection of target object
  • cutoff = float: outlier rejection cutoff in RMS {default: 2.0}
  • cycles = int: maximum number of outlier rejection cycles {default: 5}
  • gap, extend, max_gap: sequence alignment parameters
  • object = string: name of alignment object to create {default: (no alignment object)}
  • matrix = string: file name of substitution matrix for sequence alignment {default: BLOSUM62}
  • mobile_state = int: object state of mobile selection {default: 0 = all states}
  • target_state = int: object state of target selection {default: 0 = all states}
  • quiet = 0/1: suppress output {default: 0 in command mode, 1 in API}
  • max_skip = ?
  • transform = 0/1: do superposition {default: 1}
  • reset = ?

Alignment Objects

An alignment object can be created with the object=somename argument. An alignment object provides:

  • aligned sequence viewer
  • graphical representation of aligned atom pairs as lines in the 3D viewer
  • can be saved to a clustalw sequence alignment file
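The atom pairs stored in an alignment object can also be read from Python. The following is a minimal sketch, assuming PyMOL's `cmd.get_raw_alignment`, which returns one list per alignment column containing `(object_name, atom_index)` tuples. The helper name `aligned_index_pairs` is hypothetical, and `cmd` is passed in as an argument so the function can be tried outside a PyMOL session.

```python
def aligned_index_pairs(cmd, aln_name):
    """Return the paired atoms of an alignment object.

    `cmd` is assumed to be PyMOL's `pymol.cmd` module (or any object with a
    compatible `get_raw_alignment` method). Each returned item is a pair of
    (object_name, atom_index) tuples; columns that do not contain exactly
    two atoms are skipped.
    """
    columns = cmd.get_raw_alignment(aln_name)
    return [tuple(col) for col in columns if len(col) == 2]

# Inside a PyMOL session (hypothetical usage):
#   from pymol import cmd
#   cmd.align("1oky", "1t46", object="alnobj")
#   pairs = aligned_index_pairs(cmd, "alnobj")
#   print(len(pairs), "atom pairs")
```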

RMSD

The RMSD of the aligned atoms (after outlier rejection!) is reported in the text output. The all-atom RMSD can be obtained by setting cycles=0 and thus not doing any outlier rejection. The RMSD can also be captured with a Python script; see the API paragraph below. Note that the output prints "RMS", but this is in fact the RMSD, in Angstroms.
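To illustrate why the reported RMSD depends on outlier rejection, here is a simplified, pure-Python sketch. It is not PyMOL's implementation: it assumes that a pair is rejected when its deviation exceeds `cutoff` times the current RMS, and it does not re-superpose the structures between cycles. The function names and the sample deviations are made up for illustration.

```python
import math

def rms(values):
    """Root mean square of a sequence of per-pair deviations (Angstroms)."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def reject_outliers(deviations, cutoff=2.0, cycles=5):
    """Drop pairs whose deviation exceeds cutoff * RMS, up to `cycles` times.

    Simplification: a real fit would re-superpose after each rejection, so
    the deviations would change; here they stay fixed.
    Returns (final_rms, kept_deviations, cycles_run).
    """
    kept = list(deviations)
    cycles_run = 0
    for _ in range(cycles):
        threshold = cutoff * rms(kept)
        survivors = [d for d in kept if d <= threshold]
        if not survivors or len(survivors) == len(kept):
            break  # nothing (or everything) would be rejected: stop
        kept = survivors
        cycles_run += 1
    return rms(kept), kept, cycles_run

# Made-up deviations: four well-fitting pairs and one outlier.
sample = [0.5, 0.6, 0.4, 0.5, 6.0]
refined_rms, kept, n_cycles = reject_outliers(sample)            # outlier dropped
unrefined_rms, kept_all, _ = reject_outliers(sample, cycles=0)   # like cycles=0
```

With `cycles=0` all pairs contribute, so the single outlier dominates the value; with rejection enabled, only the four well-fitting pairs remain.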

Examples

fetch 1oky 1t46, async=0

# 1) default with outlier rejection
align 1oky, 1t46

# 2) with alignment object, save to clustalw file
align 1oky, 1t46, object=alnobj
save alignment.aln, alnobj

# 3) all-atom RMSD (no outlier rejection) and without superposition
align 1oky, 1t46, cycles=0, transform=0

PyMOL API

cmd.align( string mobile, string target, float cutoff=2.0,
           int cycles=5, float gap=-10.0, float extend=-0.5,
           int max_gap=50, string object=None, string matrix='BLOSUM62',
           int mobile_state=0, int target_state=0, int quiet=1,
           int max_skip=0, int transform=1, int reset=0 )

This returns a list with 7 items:

  1. RMSD after refinement
  2. Number of aligned atoms after refinement
  3. Number of refinement cycles
  4. RMSD before refinement
  5. Number of aligned atoms before refinement
  6. Raw alignment score
  7. Number of residues aligned
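Because the return value is a plain list, it can be convenient to give the seven items names. The following sketch wraps the list in a `namedtuple` whose fields follow the order listed above; `AlignResult` and `unpack_align` are hypothetical names, not part of PyMOL.

```python
from collections import namedtuple

# Field order follows the 7 items listed above.
AlignResult = namedtuple("AlignResult", [
    "rmsd_after",        # 1. RMSD after refinement
    "atoms_after",       # 2. number of aligned atoms after refinement
    "cycles_run",        # 3. number of refinement cycles
    "rmsd_before",       # 4. RMSD before refinement
    "atoms_before",      # 5. number of aligned atoms before refinement
    "raw_score",         # 6. raw alignment score
    "residues_aligned",  # 7. number of residues aligned
])

def unpack_align(result):
    """Convert the list returned by cmd.align into an AlignResult."""
    if len(result) != 7:
        raise ValueError("expected 7 items, got %d" % len(result))
    return AlignResult(*result)

# Inside a PyMOL session:
#   from pymol import cmd
#   res = unpack_align(cmd.align("1oky", "1t46"))
#   print(res.rmsd_after, res.atoms_after)
```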

Notes

  • The molecules you want to align need to be in two different objects. Otherwise, PyMOL will answer with: ExecutiveAlign: invalid selections for alignment. You can work around this by making a temporary object and aligning your original to the copy.
  • By default, all states (as in NMR structures or trajectories) are considered, which might yield a bad or suboptimal alignment for a single state. Use the mobile_state and target_state arguments to be explicit in such cases.
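For multi-state objects, one way to be explicit is to align each state separately and compare the results. The sketch below assumes `cmd.count_states` and `cmd.align` from PyMOL, and uses `transform=0` so that no object is moved while scanning. The helper name `rmsd_per_state` is hypothetical, and `cmd` is passed in as an argument so the function can be exercised without PyMOL.

```python
def rmsd_per_state(cmd, mobile, target, target_state=1):
    """Map each state of `mobile` to its RMSD against one state of `target`.

    `cmd` is assumed to be PyMOL's `pymol.cmd` module. The first item of the
    list returned by cmd.align is the RMSD after refinement.
    """
    table = {}
    for state in range(1, cmd.count_states(mobile) + 1):
        result = cmd.align(
            mobile, target,
            mobile_state=state,
            target_state=target_state,
            transform=0,  # compute the fit only; do not move the object
        )
        table[state] = result[0]
    return table

# Inside a PyMOL session (object names are placeholders):
#   from pymol import cmd
#   table = rmsd_per_state(cmd, "nmr_ensemble", "xray_model")
#   best = min(table, key=table.get)
#   cmd.align("nmr_ensemble", "xray_model", mobile_state=best, target_state=1)
```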

See Also