Praktikum Biophysikalische Chemie

Latest revision as of 09:33, 13 January 2009

Startup

Log in to the Windows XP system with the username and password of your HRZ account.

The structure calculation will be performed on a Linux server that can be accessed by clicking the icon labeled BCC Linux blade41.nxs. The username and password for the Linux server are the same as for the Windows XP computer, and the data in the home directories are shared between the two systems.

In a terminal window, start the bash shell and execute the setup command

bash
. /usr/local/courseexchange/cyana_setup.sh

Create a new directory, vaso, for the structure calculation, and change into it:

mkdir vaso
cd vaso

Write the sequence file

Use a text editor to write a new file called vaso.seq that contains the peptide sequence, one upper-case residue name per line, given in the standard three-letter code for amino acids (except for cysteine residues that are involved in a disulfide bond, which are denoted by "CYSS"), e.g.

CYSS
TYR
PHE
GLN
ASN
CYSS
PRO
ARG
GLY
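
Before starting CYANA, it can help to check the sequence file for typing errors. The following Python sketch is not part of CYANA; the function name check_sequence and the list of allowed names are assumptions made for this illustration. It checks only the convention described above (upper-case three-letter codes, with CYSS for disulfide-bonded cysteines) and knows only the 20 standard amino acids, whereas the CYANA library defines more residue types.

```python
# Residue names accepted by this sketch: the 20 standard three-letter codes
# plus CYSS (disulfide-bonded cysteine). This list is an assumption of the
# sketch, not the content of the CYANA residue library.
STANDARD = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
ALLOWED = STANDARD | {"CYSS"}


def check_sequence(text):
    """Return a list of (line_number, message) problems in a .seq file body."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        name = line.strip()
        if not name:
            continue  # this sketch simply skips blank lines
        if name != name.upper():
            problems.append((number, f"residue name {name!r} is not upper case"))
        elif name not in ALLOWED:
            problems.append((number, f"unknown residue name {name!r}"))
    return problems


vaso = "CYSS\nTYR\nPHE\nGLN\nASN\nCYSS\nPRO\nARG\nGLY\n"
print(check_sequence(vaso))           # []
print(check_sequence("CYSS\ntyr\n"))  # [(2, "residue name 'tyr' is not upper case")]
```

To check a real file, pass its contents, e.g. check_sequence(open("vaso.seq").read()).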

Write the initialization script

Use a text editor to write a new initialization script, init.cya, for the program CYANA with the following content:

cyanalib
read seq vaso.seq 
library rename H atom=HN
rmsdrange:=1-6

These commands will be executed automatically whenever the program CYANA is started. The cyanalib command reads the standard residue library of CYANA, and the command read seq vaso.seq reads the polypeptide sequence. The command library rename H atom=HN changes the name of the backbone amide hydrogen atom from "H" to "HN". The variable rmsdrange is set to the preferred residue range for the RMSD calculation, here residues 1-6.

Write the NOE distance restraint file

Use a text editor to write a new file, vaso.upl, that contains the upper distance bounds derived from NOESY cross peaks, using the standard CYANA nomenclature for atoms in proteins and the same format as in the following example:

91 THR  HB     93 GLN  QB      5.50
80 SER  HB2    81 ILE  H       4.22
80 SER  HB3    81 ILE  H       4.22
81 ILE  HA     84 LEU  H       4.01
81 ILE  HA     84 LEU  HB2     4.47
81 ILE  HA     81 ILE  QG2     3.46
81 ILE  HA     81 ILE  HG12    3.77
28 VAL  HA     39 LEU  HG      3.97
52 SER  H      52 SER  HB2     3.96
52 SER  H      52 SER  HB3     3.96
99 SER  QB    101 VAL  H       5.50
43 SER  H      43 SER  QB      3.12
43 SER  QB     48 GLU  H       4.07
42 GLU  HA     43 SER  QB      5.50
43 SER  QB     48 GLU  HB2     3.95

Each line specifies an upper bound on the distance between two hydrogen atoms. The data in the 7 columns are:

  1. First residue number
  2. First residue name
  3. First atom name
  4. Second residue number
  5. Second residue name
  6. Second atom name
  7. Upper distance bound in Å

Residue and atom names are given in upper case letters. The exact number of spaces between different items is irrelevant, but the "TAB" key should not be used.
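
The format rules above (7 whitespace-separated columns, upper-case names, no TAB characters) can be checked mechanically. The following Python sketch is an illustration only; the function name parse_upl_line is hypothetical and the checks are limited to the rules stated in this section, so a line accepted here may still be rejected by CYANA for other reasons (e.g. an atom name that does not exist in the residue).

```python
def parse_upl_line(line):
    """Split one restraint line into its 7 columns; raise ValueError if malformed."""
    if "\t" in line:
        raise ValueError("TAB characters should not be used")
    fields = line.split()  # any number of spaces between items is accepted
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, found {len(fields)}")
    res1, name1, atom1, res2, name2, atom2, bound = fields
    for item in (name1, atom1, name2, atom2):
        if item != item.upper():
            raise ValueError(f"{item!r} is not upper case")
    # Residue numbers are integers, the upper distance bound is a number in Å.
    return (int(res1), name1, atom1, int(res2), name2, atom2, float(bound))


print(parse_upl_line("80 SER  HB2    81 ILE  H       4.22"))
# (80, 'SER', 'HB2', 81, 'ILE', 'H', 4.22)
```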

Degenerate groups of atoms, e.g. methyl groups, and diastereotopic pairs of hydrogen atoms, e.g. HB2/HB3 in serine, are referred to by "pseudoatoms" whose names are derived from the names of the hydrogen atoms that they represent by changing the first letter from "H" to "Q" and omitting the last digit. For instance, "HB2" and "HB3" are represented by a pseudoatom called "QB".
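
The naming rule for pseudoatoms can be written as a one-line string operation. The Python sketch below implements only the rule as stated above (first letter "H" becomes "Q", last digit is dropped); the function name pseudoatom_name is hypothetical, and pseudoatoms that follow other conventions are outside its scope.

```python
def pseudoatom_name(hydrogen):
    """Derive the pseudoatom name from one of the hydrogen names it represents."""
    if len(hydrogen) < 3 or not hydrogen.startswith("H") or not hydrogen[-1].isdigit():
        raise ValueError(f"{hydrogen!r} is not of the form H...<digit>")
    # Replace the leading "H" by "Q" and omit the last digit.
    return "Q" + hydrogen[1:-1]


print(pseudoatom_name("HB2"))   # QB
print(pseudoatom_name("HB3"))   # QB
print(pseudoatom_name("HG21"))  # QG2
```

The last call corresponds to the methyl pseudoatom QG2 of isoleucine that appears in the example restraint file.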

Write the CYANA script to execute the structure calculation

Use a text editor to write a new CYANA script, CALC.cya, with the following content:

read upl vaso.upl
ssbond 1-6
calc_all 50 steps=3000
overview vaso.ovw structures=10 pdb

The read upl command reads the input file with upper distance limits, vaso.upl.

The ssbond command adds restraints for the disulfide bond between residues Cys1 and Cys6.

The calc_all command performs a structure calculation starting from 50 conformers with random torsion angle values. Simulated annealing with 3000 torsion angle dynamics steps per conformer is used.

The overview command sorts the resulting structures by ascending target function value, analyzes the 10 best conformers for violations of the conformational restraints, and saves the results of the analysis in an overview file, vaso.ovw, and the coordinates of the 10 best conformers in a PDB file, vaso.pdb.

Run the CYANA structure calculation

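Start CYANA, and execute the CYANA script CALC.cya (the sample session below apparently comes from a different project, so the sequence file name and the number of residues will differ from what you see):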

> cyana
___________________________________________________________________
CYANA 3.0 (intel)
 
Copyright (c) 2002-08 Peter Guntert. All rights reserved.
___________________________________________________________________

    Library file "/usr/local/soft/cyana-3.0/lib/cyana.lib" read, 38 residue types.
    Sequence file "demo.seq" read, 114 residues.
cyana> CALC

Analyze the results of the structure calculation

The results of the structure calculation are the structural statistics in the overview file, vaso.ovw, and the structure itself, which is represented by a bundle of 10 conformers whose coordinates are stored in the PDB file, vaso.pdb.

The overview file, vaso.ovw, has three parts. The file starts with a table of the target function values and restraint violation statistics. For example:

Structural statistics:

str   target     upper limits    van der Waals   torsion angles
    function   #    rms   max   #    sum   max   #    rms   max
  1     1.69   2 0.0076  0.36   4    5.6  0.34   0 0.3302  3.23
  2     1.74   2 0.0077  0.36   5    5.9  0.34   0 0.3272  3.30
  3     1.75   1 0.0075  0.36   5    5.7  0.34   0 0.3695  3.45
  4     1.87   1 0.0075  0.37   7    6.3  0.34   0 0.3159  2.69
  5     1.95   1 0.0075  0.37   5    6.7  0.37   0 0.3185  3.00
  6     2.12   2 0.0084  0.36   6    6.6  0.34   0 0.3745  3.56
  7     2.19   2 0.0100  0.50   7    6.8  0.34   0 0.3257  3.29
  8     2.35   2 0.0096  0.36   8    7.0  0.34   0 0.3748  3.50
  9     2.40   2 0.0088  0.36   5    8.4  0.35   0 0.4152  3.44
 10     2.49   2 0.0090  0.36   9    7.6  0.33   0 0.3494  3.20

Ave     2.06   2 0.0084  0.38   6    6.7  0.34   0 0.3501  3.26
+/-     0.28   0 0.0009  0.04   2    0.8  0.01   0 0.0308  0.25
Min     1.69   1 0.0075  0.36   4    5.6  0.33   0 0.3159  2.69
Max     2.49   2 0.0100  0.50   9    8.4  0.37   0 0.4152  3.56
Cut                      0.20             0.20             5.00

This table has one row for each structure, containing

  • the rank of the structure sorted by target function value
  • the target function value
  • three columns for each type of conformational restraints:
    • the number of restraints that are violated by more than the cutoff value given in the last row (“Cut”)
    • the root-mean-square (RMS) violation calculated over all, violated and fulfilled, restraints of this type
    • the maximal violation

The five bottom lines of the table give the average value, the standard deviation, the minimum value, and the maximum value of the corresponding quantity over the individual structures, as well as the cutoff value for significant violations.
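
The summary rows can be reproduced from the per-structure rows. The following Python sketch extracts the target function values from the example table above and recomputes their statistics; the function name target_values is hypothetical, and the parsing assumes that every per-structure row starts with an integer rank, as in the example. For this example, the "+/-" value of 0.28 is matched by the population standard deviation; this is an observation about the example data, not a documented statement about how CYANA computes it.

```python
from statistics import mean, pstdev

# Per-structure rows copied from the example overview table above.
TABLE = """\
  1     1.69   2 0.0076  0.36   4    5.6  0.34   0 0.3302  3.23
  2     1.74   2 0.0077  0.36   5    5.9  0.34   0 0.3272  3.30
  3     1.75   1 0.0075  0.36   5    5.7  0.34   0 0.3695  3.45
  4     1.87   1 0.0075  0.37   7    6.3  0.34   0 0.3159  2.69
  5     1.95   1 0.0075  0.37   5    6.7  0.37   0 0.3185  3.00
  6     2.12   2 0.0084  0.36   6    6.6  0.34   0 0.3745  3.56
  7     2.19   2 0.0100  0.50   7    6.8  0.34   0 0.3257  3.29
  8     2.35   2 0.0096  0.36   8    7.0  0.34   0 0.3748  3.50
  9     2.40   2 0.0088  0.36   5    8.4  0.35   0 0.4152  3.44
 10     2.49   2 0.0090  0.36   9    7.6  0.33   0 0.3494  3.20
Ave     2.06   2 0.0084  0.38   6    6.7  0.34   0 0.3501  3.26
"""


def target_values(table_text):
    """Return the target function value of every per-structure row."""
    values = []
    for line in table_text.splitlines():
        fields = line.split()
        # Per-structure rows start with the integer rank; summary rows do not.
        if len(fields) >= 2 and fields[0].isdigit():
            values.append(float(fields[1]))
    return values


values = target_values(TABLE)
print(f"{len(values)} structures")
print(f"mean {mean(values):.3f}, standard deviation {pstdev(values):.3f}")
print(f"min {min(values):.2f}, max {max(values):.2f}")
```

Up to rounding, the printed values should agree with the "Ave", "+/-", "Min" and "Max" entries of the target function column.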

Restraints that are violated in a significant number of structures by more than the corresponding cutoff value are reported in the second part of the overview file:

Constraints violated in 3 or more structures:
                                               #   mean   max.  1   5   10
Upper QB    LEU   17 - QB    PRO  108   3.69   3   0.10   0.50  ++    *     peak 1009
Upper HB    ILE   85 - H     ASP   86   3.80  10   0.36   0.37  +++*++++++  peak 803
VdW   N     ILE   81 - HD2   PRO   82   2.45  10   0.34   0.37  ++++*+++++
VdW   CG2   ILE   81 - C     ILE   81   2.90   6   0.20   0.21   + + ++* +
2 violated distance restraints.
0 violated angle restraints.

Each line identifies a violated restraint, and gives the number of structures in which the restraint is violated by more than the aforementioned cutoff value (column labeled “#”), its maximal violation (column “max.”), and the structures in which the violations occur (a one-character column for each structure that is analyzed). Structures in which the restraint is violated by more than the cutoff are marked with “+”, or with a “*” for the structure in which the maximal violation occurs. If available, the number of the cross peak from which the restraint originated is given at the end of the line.
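
The per-structure marker columns can be decoded as follows. This Python sketch assumes that the marker string has been cut out of the line with its blanks intact, one character per analyzed structure; the function name summarize_marks is hypothetical. Note that copying the overview file through a web page or word processor may change the spacing, in which case the structure numbers derived here are no longer reliable.

```python
def summarize_marks(marks):
    """Return (number of violating structures, structure with the maximal violation)."""
    # "+" and "*" both mark a violation above the cutoff; blanks mean no violation.
    violated = [i for i, ch in enumerate(marks, start=1) if ch in "+*"]
    # "*" marks the structure in which the maximal violation occurs.
    worst = marks.index("*") + 1 if "*" in marks else None
    return len(violated), worst


print(summarize_marks("+++*++++++"))  # (10, 4)
print(summarize_marks(" + + ++* +"))  # (6, 8)
```

The first number should equal the entry in the column labeled “#” of the same line.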

At the end of the overview file, root-mean-square deviation (RMSD) values for the atom positions after optimal superposition of the individual conformers onto the mean coordinates are given:

RMSDs for residues 10..100:
Average backbone RMSD to mean   :    0.58 +/- 0.11 A (0.45..0.82 A; 10 structures)
Average heavy atom RMSD to mean :    1.08 +/- 0.11 A (0.93..1.25 A; 10 structures)

The residue range used for the superposition is indicated. RMSD values are computed for the backbone and heavy atoms of the given residues. The average value, the standard deviation, and the minimal and maximal values of the RMSDs between the analyzed structures and their mean coordinates are calculated.
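
For collecting results from several calculations, the RMSD lines can be read with a regular expression. The Python sketch below is written against the line layout of the example above; the pattern and the function name parse_rmsd_line are assumptions made for this illustration and may need adjusting if the layout of the overview file differs.

```python
import re

# Pattern modeled on the two example lines shown above.
RMSD_LINE = re.compile(
    r"Average (?P<atoms>backbone|heavy atom) RMSD to mean\s*:\s*"
    r"(?P<mean>\d+\.\d+) \+/- (?P<sd>\d+\.\d+) A "
    r"\((?P<min>\d+\.\d+)\.\.(?P<max>\d+\.\d+) A; (?P<n>\d+) structures\)"
)


def parse_rmsd_line(line):
    """Return the RMSD statistics of one overview line as a dict, or None."""
    match = RMSD_LINE.search(line)
    if match is None:
        return None
    return {
        "atoms": match["atoms"],
        "mean": float(match["mean"]),   # average RMSD to the mean coordinates, in Å
        "sd": float(match["sd"]),       # standard deviation, in Å
        "min": float(match["min"]),
        "max": float(match["max"]),
        "structures": int(match["n"]),
    }


line = "Average backbone RMSD to mean   :    0.58 +/- 0.11 A (0.45..0.82 A; 10 structures)"
print(parse_rmsd_line(line))
```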

Visualize the structure

The program MOLMOL can visualize the bundle of conformers that represents the solution structure of the peptide. Use the command

molmol -r 1-6 vaso.pdb

to start the program MOLMOL and to show a superposition of the 10 conformers whose coordinates are stored in the PDB file vaso.pdb. The option "-r 1-6" instructs MOLMOL to optimally superimpose the backbone atoms of residues 1-6.

Printing

A local printer can be accessed with the name bpc_lokal from Linux, or BPC HP Laser Jet 4250 PS from Windows XP.