Ccp4 ncont: Difference between revisions

From PyMOLWiki
m (moved ContactsNCONT to SelectNCONTContacts: unify naming scheme for new scripts)
(updated script with faster selection and naming scheme.)
Line 1: Line 1:
[[File:HhaExample.png|thumb|300px|right|Interface residues (at cutoff <4A) in the 2c7r.pdb were found using NCONT. Usage of ContactsNCONT script in PyMOL allows easy selection of residues and atoms listed in ncont.log file. Interacting protein and DNA residues are colored in red and slate, respectively. Atoms in contact are shown in dots.]]
[[File:HhaExample.png|thumb|300px|right|Interface residues (at cutoff <4A) in the 2c7r.pdb were found using NCONT. Usage of selectNCONTContacts script in PyMOL allows easy selection of residues and atoms listed in ncont.log file. Interacting protein and DNA residues are colored in red and slate, respectively. Atoms in contact are shown in dots.]]


== Overview ==
== Overview ==


The script selects residues and atoms from the list of the contacts found by NCONT from CCP4 Program Suite (NCONT analyses contacts between subsets of atoms in a PDB file).
The script selects residues and atoms from the list of the contacts found by NCONT from CCP4 Program Suite (NCONT analyses contacts between subsets of atoms in a PDB file).
First, we run NCONT on our pdb file to find interface residues. Then by using the ContactsNCONT script in PyMOL we separately select residues and atoms listed in a ncont.log file. This generates two selections (atoms and residues) for each interacting chain, allowing quick manipulation of (sometimes) extensive lists in NCONT log file.
First, we run NCONT on our pdb file to find interface residues. Then by using the selectNCONTContacts script in PyMOL we separately select residues and atoms listed in a ncont.log file. This generates two selections (atoms and residues) for each interacting chain, allowing quick manipulation of (sometimes) extensive lists in NCONT log file.


This script works best for intermolecular contacts (when NCONT target and source selections don't overlap). If crystal contacts (NCONT parameter cell = 1 or 2) are included then additional coding is required to distinguish inter from intramolecular contacts.
This script works best for intermolecular contacts (when NCONT target and source selections don't overlap). If crystal contacts (NCONT parameter cell = 1 or 2) are included then additional coding is required to distinguish inter from intramolecular contacts.
Line 10: Line 10:
== Usage ==
== Usage ==


selectContacts( contactsfile, selName1 = "source", selName2 = "target" )
selectNCONTContacts( contactsfile, selName1 = "source", selName2 = "target" )




Line 17: Line 17:


First use NCONT to find interface residues/atoms in the pdb file. Once you have ncont.log file proceed to PyMOL.
First use NCONT to find interface residues/atoms in the pdb file. Once you have ncont.log file proceed to PyMOL.
Make sure you've run the ContactsNCONT script first.
Make sure you've run the selectNCONTContacts script first.
   
   
  fetch 2c7r
  fetch 2c7r
  selectContacts ncont.log, selName1=prot, selName2=dna
  selectNCONTContacts ncont.log, selName1=prot, selName2=dna


[[File:HhaI20example.png|thumb|300px|right|Quick and easy selection of interacting residues and atoms listed in the NCONT log file. Protein and DNA residues are colored in red and slate, respectively. Atoms in contact are shown in dots.]]
[[File:HhaI20example.png|thumb|300px|right|Quick and easy selection of interacting residues and atoms listed in the NCONT log file. Protein and DNA residues are colored in red and slate, respectively. Atoms in contact are shown in dots.]]
Line 27: Line 27:
<source lang="python">
<source lang="python">
import re
import re
 
def parseContacts( f ):
def parseNCONTContacts( f ):
     # /1/B/ 282(PHE). / CE1[ C]:  /1/E/ 706(GLN). / O  [ O]:  3.32
     # /1/B/ 282(PHE). / CE1[ C]:  /1/E/ 706(GLN). / O  [ O]:  3.32
    # conParser = re.compile("\s*/(\d+)/([A-Z])/\s*(\d+).*?/\s*([A-Z0-9]*).*?:")
     conParser = re.compile("\s*/(\d+)/([A-Z]*)/\s*(\d+).*?/\s*([A-Z0-9]*).*?:") # * in the second group is needed when chain code is blank
     conParser = re.compile("\s*/(\d+)/([A-Z]*)/\s*(\d+).*?/\s*([A-Z0-9]*).*?:") # * is needed when chain code is blank
     mode = 0
     mode = 0
     s1 = []
     s1 = []
Line 54: Line 53:
         else:
         else:
             print "Unknown mode", mode
             print "Unknown mode", mode
 
def selectContacts( contactsfile, selName1 = "source", selName2 = "target" ):
def selectNCONTContacts( contactsfile, selName1 = "source", selName2 = "target" ):
     """
     """
     selectContacts -- parses CCP4 NCONT log file and selects residues and atoms from the list of the contacts found.
     selectContacts -- parses CCP4/NCONT log file and selects residues and atoms.
    http://www.ccp4.ac.uk/html/ncont.html
   
     PARAMS
     PARAMS
         contactsfile
         contactsfile
             filename of the CCP4 NCONT contacts log file
             filename of the CCP4/NCONT contacts log file
 
         selName1
         selName1
             the name prefix for the _res and _atom selections returned for the
             the name prefix for the _res and _atom selections returned for the
             source set of chain
             source set of chain
 
         selName2
         selName2
             the name prefix for the _res and _atom selections returned for the  
             the name prefix for the _res and _atom selections returned for the  
             target set of chain
             target set of chain
 
     RETURNS
     RETURNS
         * 2 selections of interface residues and atoms for each chain are created and named
         4 selections of interface residues and atoms are created and named
            depending on what you passed into selName1 and selName2
        depending on what you passed into selName1 and selName2
 
     AUTHOR:
    REPOSITORY
         Gerhard Reitmayr and Dalia Daujotyte, 2009.      
        https://github.com/GerhardR/pymol-scripts
 
     AUTHOR
         Gerhard Reitmayr and Dalia Daujotyte, 2009.
     """
     """
     # read and parse contacts file into two lists of contact atoms and contact pair list
     # read and parse contacts file into two lists of contact atoms and contact pair list
     s1, s2, pairs = parseContacts(open(contactsfile))
     s1, s2, pairs = parseNCONTContacts(open(contactsfile))
     # create a selection for the first contact list
     # create a selection for the first contact list
     resName = selName1 + "_res"
      
     atomName = selName1 + "_atom"
    # create the PYMOL selection macros for the residues
     cmd.select(resName, None)
    resNames = [chain+"/"+residue+"/" for (type, chain, residue, atom) in s1]
     cmd.select(atomName, None)
     # put them in a set to remove duplicates and then join with 'or'
     for (thing, chain, residue, atom) in s1:
    resSel = " or ".join(frozenset(resNames))
        cmd.select( resName, resName + " or " + chain+"/"+residue+"/")
     # finally select them under the new name
        cmd.select( atomName, atomName + " or " + chain+"/"+residue+"/"+atom)
     cmd.select(selName1 + "_res", resSel)
      
    atomNames = [chain+"/"+residue+"/"+atom for (type, chain, residue, atom) in s1]
    atomSel = " or ".join(frozenset(atomNames))
    cmd.select(selName1 + "_atom", atomSel)
 
     # create a selection for the second contact list
     # create a selection for the second contact list
     resName = selName2 + "_res"
 
     atomName = selName2 + "_atom"
     resNames = [chain+"/"+residue+"/" for (type, chain, residue, atom) in s2]
    cmd.select(resName, None)
     resSel = " or ".join(frozenset(resNames))
     cmd.select(atomName, None)
     cmd.select(selName2 + "_res", resSel)
     for (thing, chain, residue, atom) in s2:
      
        cmd.select( resName, resName + " or " + chain+"/"+residue+"/")
    atomNames = [chain+"/"+residue+"/"+atom for (type, chain, residue, atom) in s2]
        cmd.select( atomName, atomName + " or " + chain+"/"+residue+"/"+atom)
    atomSel = " or ".join(frozenset(atomNames))
    cmd.select(selName2 + "_atom", atomSel)
cmd.extend("selectContacts", selectContacts)
 
cmd.extend("selectNCONTContacts", selectNCONTContacts)
</source>
</source>
== Code repository ==
The latest version of this script and related scripts is available at https://github.com/GerhardR/pymol-scripts.


[[Category:Script_Library]] [[Category:ThirdParty Scripts]] [[Category:Structural Biology Scripts]]
[[Category:Script_Library]] [[Category:ThirdParty Scripts]] [[Category:Structural Biology Scripts]]

Revision as of 15:19, 19 June 2011

Interface residues (cutoff < 4 Å) in 2c7r.pdb were found using NCONT. The selectNCONTContacts script in PyMOL allows easy selection of the residues and atoms listed in the ncont.log file. Interacting protein and DNA residues are colored red and slate, respectively. Atoms in contact are shown as dots.

Overview

The script selects residues and atoms from the list of contacts found by NCONT from the CCP4 Program Suite (NCONT analyses contacts between subsets of atoms in a PDB file). First, run NCONT on the PDB file to find interface residues. Then use the selectNCONTContacts script in PyMOL to select, separately, the residues and the atoms listed in the ncont.log file. This generates two selections (atoms and residues) for each of the two interacting sets (source and target), four in total, which makes the sometimes extensive lists in the NCONT log file quick to manipulate.

This script works best for intermolecular contacts (when the NCONT target and source selections don't overlap). If crystal contacts (NCONT parameter cell = 1 or 2) are included, additional coding is required to distinguish intermolecular from intramolecular contacts.

Usage

selectNCONTContacts( contactsfile, selName1 = "source", selName2 = "target" )


Examples

First use NCONT to find interface residues/atoms in the PDB file. Once you have the ncont.log file, proceed to PyMOL. Make sure you have run the selectNCONTContacts script first, so that the command is registered.

fetch 2c7r
selectNCONTContacts ncont.log, selName1=prot, selName2=dna

Quick and easy selection of the interacting residues and atoms listed in the NCONT log file. Protein and DNA residues are colored red and slate, respectively. Atoms in contact are shown as dots.
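The display described in the caption can be reproduced with ordinary PyMOL `color` and `show` commands applied to the four selections. The following is a minimal sketch, not part of the original script: the helper name `display_commands` is hypothetical, and the default prefixes `prot` and `dna` are simply the ones passed as selName1 and selName2 in the example above.

```python
def display_commands(sel1="prot", sel2="dna"):
    """Build PyMOL command strings that style the four selections.

    Hypothetical helper: it only assembles text and does not call PyMOL.
    """
    return [
        # residue-level selections carry the colors named in the caption
        "color red, %s_res" % sel1,
        "color slate, %s_res" % sel2,
        # atom-level selections are drawn as dots
        "show dots, %s_atom or %s_atom" % (sel1, sel2),
    ]

for command in display_commands():
    print(command)
```

The printed lines can be typed at the PyMOL prompt, or each string can be passed to `cmd.do` from a script running inside PyMOL.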

The Code

import re
from pymol import cmd

def parseNCONTContacts( f ):
    # /1/B/ 282(PHE). / CE1[ C]:  /1/E/ 706(GLN). / O  [ O]:   3.32
    conParser = re.compile(r"\s*/(\d+)/([A-Z]*)/\s*(\d+).*?/\s*([A-Z0-9]*).*?:") # * in the second group is needed when chain code is blank
    mode = 0
    s1 = []
    s2 = []
    pairs = []
    for line in f:
        if mode == 0:
            if line.strip().startswith("SOURCE ATOMS"):
                mode = 1
        elif mode == 1:
            mode = 2
        elif mode == 2:
            matches = conParser.findall(line)
            if len(matches) == 0:
                return (s1, s2, pairs)
            if len(matches) == 2:
                s1.append(matches[0])
                s2.append(matches[1])
            elif len(matches) == 1:
                s2.append(matches[0])
            pairs.append((len(s1)-1, len(s2)-1))
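        else:
            print("Unknown mode", mode)
    # the file ended before a non-contact line was seen: return what was collected
    return (s1, s2, pairs)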

def selectNCONTContacts( contactsfile, selName1 = "source", selName2 = "target" ):
    """
    selectNCONTContacts -- parses CCP4/NCONT log file and selects residues and atoms.
    http://www.ccp4.ac.uk/html/ncont.html
    
    PARAMS
        contactsfile
            filename of the CCP4/NCONT contacts log file

        selName1
            the name prefix for the _res and _atom selections returned for the
            source set of chain

        selName2
            the name prefix for the _res and _atom selections returned for the 
            target set of chain

    RETURNS
        4 selections of interface residues and atoms are created and named
        depending on what you passed into selName1 and selName2

    REPOSITORY
        https://github.com/GerhardR/pymol-scripts

    AUTHOR
        Gerhard Reitmayr and Dalia Daujotyte, 2009.
    """
    # read and parse contacts file into two lists of contact atoms and contact pair list
    with open(contactsfile) as f:
        s1, s2, pairs = parseNCONTContacts(f)
    # create a selection for the first contact list
    
    # create the PYMOL selection macros for the residues 
    resNames = [chain+"/"+residue+"/" for (type, chain, residue, atom) in s1]
    # put them in a set to remove duplicates and then join with 'or'
    resSel = " or ".join(frozenset(resNames))
    # finally select them under the new name
    cmd.select(selName1 + "_res", resSel)
    
    atomNames = [chain+"/"+residue+"/"+atom for (type, chain, residue, atom) in s1]
    atomSel = " or ".join(frozenset(atomNames))
    cmd.select(selName1 + "_atom", atomSel)

    # create a selection for the second contact list

    resNames = [chain+"/"+residue+"/" for (type, chain, residue, atom) in s2]
    resSel = " or ".join(frozenset(resNames))
    cmd.select(selName2 + "_res", resSel)
    
    atomNames = [chain+"/"+residue+"/"+atom for (type, chain, residue, atom) in s2]
    atomSel = " or ".join(frozenset(atomNames))
    cmd.select(selName2 + "_atom", atomSel)

cmd.extend("selectNCONTContacts", selectNCONTContacts)
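
To see what the parser extracts and how the selection strings are assembled, the two steps can be tried outside PyMOL on the sample log line quoted in the code comment. This is a sketch under assumptions: the regular expression is the one from the script, while the helper names `parse_contact_line` and `build_selection` are hypothetical. Falling back to PyMOL's `none` keyword for an empty list and sorting the macros are additions for robustness and reproducible output, not something the script itself does.

```python
import re

# Same pattern as in parseNCONTContacts, written as a raw string.
CON_PARSER = re.compile(r"\s*/(\d+)/([A-Z]*)/\s*(\d+).*?/\s*([A-Z0-9]*).*?:")

def parse_contact_line(line):
    """Return the (model, chain, residue, atom) tuples found on one log line."""
    return CON_PARSER.findall(line)

def build_selection(atoms, with_atom_name=False):
    """Join unique chain/residue/[atom] macros with 'or' (hypothetical helper)."""
    macros = set()  # a set removes duplicates, as frozenset does in the script
    for _model, chain, residue, atom in atoms:
        macro = chain + "/" + residue + "/"
        if with_atom_name:
            macro += atom
        macros.add(macro)
    # "none" is PyMOL's empty selection; sorting keeps the output reproducible
    return " or ".join(sorted(macros)) or "none"

line = " /1/B/ 282(PHE). / CE1[ C]:  /1/E/ 706(GLN). / O  [ O]:   3.32"
source, target = parse_contact_line(line)
print(source)                                          # ('1', 'B', '282', 'CE1')
print(build_selection([source]))                       # B/282/
print(build_selection([target], with_atom_name=True))  # E/706/O
```

Each resulting string is the kind of expression the script passes to `cmd.select`. Joining all macros first and selecting once keeps the number of PyMOL calls to one per selection.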

Code repository

The latest version of this script and related scripts is available at https://github.com/GerhardR/pymol-scripts.