ENORA and multi-state structure calculations

From CYANA Wiki
Jump to navigation Jump to search

In this tutorial we will provide you with a guided example for calculating eNOEs and a multi-state structure calculation.

To this end we will first run the modules of eNORA within CYANA and then use the obtained eNOEs to calculate a single state and a two-state structure model using automated sorting to separate the states. Along the way you will learn some additional CYANA skills useful for other purposes as well.

To finalize you will .... And ultimately you can try to improve ....


CYANA setup

Obtaining and installing the CYANA demo version and data

Please follow the following steps carefully (exact Linux commands are given below; you may copy them to a terminal):

  1. Go to your home directory (or data directory).
  2. Get the demo data from the server.
  3. Unpack the input data for the practical.
  4. Get the demo version of CYANA.
  5. Unpack CYANA.
  6. Setup the CYANA environment variables.
  7. Change into the newly created directory 'eNORA'.
  8. Copy the demo_data directory to 'enoe'.
  9. Change into the subdirectory 'enoe'.
  10. Test whether CYANA can be started by typing its name, 'cyana'.
  11. Exit from CYANA by typing 'q' or 'quit'.
cd ~
wget 'http://www.cyana.org/wiki/images/6/64/eNORA_multiState.tar.gz'
tar zxf eNORA_multiState.tar.gz 

wget 'http://www.cyana.org/wiki/images/6/64/Cyana-3.98.9_Demo.tgz'
tar zxf Cyana-3.98.9_Demo.tgz
cd cyana-3.98.9/
./setup

cd ~

cd eNORA
cp -r demo_data enoe 
cd enoe


cyana
___________________________________________________________________

CYANA 3.98 (mac-intel)

Copyright (c) 2002-17 Peter Guentert. All rights reserved.
___________________________________________________________________

    Demo license valid for specific sequences until 2018-12-31
cyana> q

If all worked, you are ready to go in terms of everything related to CYANA!

If you want to return to your practical later, using your own Linux or Mac OS X computer, you can download the demo version of CYANA from [www.cyana.org/wiki/images/6/64/Cyana-3.98.9_Demo.tgz here].

Hint: More information on the CYANA commands etc. is in the CYANA 3.0 Reference Manual.

Execution scripts or "macros" in CYANA

For more complex task within CYANA, rather than to enter the execution commands line by line at the CYANA prompt, the necessary commands are collected in a file named '*.cya'. Collecting the commands in macros has the added advantage, that the macros serve as a record allowing to reconstruct previous calculations.

eNOE calculations

All eNOE related calculations within cyana are carried out using the eNORA modules.

NOESY experiment measured at different mixing times (keeping the mixing times as much as possible within the linear regime of NOE buildup) supply very precise distance restraints used for a structure calculation. In addition other restraints such as backbone angles from chemical shifts and scalar couplings for backbone and aromatic side chains are also used.

Experimental input data

Peak lists in XEASY format are prepared by automatic peak picking with a visualization program such as CcpNmr Analysis, NMRdraw or NMRview and saved as XXX.peaks, where XXX denotes the name of the xeasy peak list file. Since NMRdraw peak lists are of different file type, cyana provides the command read tab to convert the files to XEASY format.

# Number of dimensions 3
#FORMAT xeasy3D
#INAME 1 H
#INAME 2 HN
#INAME 3 N
#SPECTRUM N15NOESY H HN N
   17086    4.098    4.099   57.441 1 U   6.990943E+08  0.000000E+00 e 0  HA.5      HA.5      CA.5
   89532    4.355    1.829   33.507 1 U   1.720779E+06  0.000000E+00 e 0  HA.6      HB2.6     CB.6
   89544    4.353    1.757   33.513 1 U   2.939628E+06  0.000000E+00 e 0  HA.6      HB3.6     CB.6

The first line specifies the number of dimensions (3 in this case). The '#SPECTRUM' (no space between characters) lines gives the experiment type (N15NOESY, which refers to the corresponding experiment definition in the CYANA library), followed by an identifier for each dimension of the peak list (H HN N) that specifies which chemical shift is stored in the corresponding dimension of the peak list. The experiment type and identifiers must correspond to an experiment definition in the general CYANA library (see below) in most uses of the definition, here however we cheat slightly and get away with it. We are cheating, because for eNOE calculations we record our NOESY spectra with simultanous evolution of 13C and 15N dimensions, since we require 15N and 13C bound spins within the same spectrum for purposes of normalization (see...).

After the '#SPECTRUM' line follows one line for every peak. For example, the first peak in the 'HNCA.peaks' list has

  • Peak number 17086
  • H chemical shift 4.098 ppm
  • ("HN") chemical shift 4.099 ppm (in this case 13C bound)
  • Heavy atom chemical shift 57.441 ppm (in this case 13C labeled)

The other data are relevant entry for the eNOE mudules is the peak volume or intensity (6.990943E+08).

Assignment dimensions have to be arranged with the flow of magnetization, first row spin i, second row spin j


The protein sequence is supplied by three-letter code in the demo.seq file.

SPECTRUM definitions in the CYANA library

When you start CYANA, the program reads the library and displays the full path name of the library file. You can open the standard library file to inspect, for example, the NMR experiment definitions . For instance, the definition for the N15NOESY spectrum (search for 'N15NOESY' in the library file 'cyana.lib') is

SPECTRUM HNCA  HN N C
 0.980  HN:H_AMI  N:N_AM*  C:C_ALI  C_BYL
 0.800  HN:H_AMI  N:N_AMI  (C_ALI) C_BYL  C:C_ALI 

The first line corresponds to the '#SPECTRUM' line in the peak list. It specifies the experiment name and identifies the atoms that are detected in each dimension of the spectrum. The number of identifiers defines the dimensionality of the experiment (3 in case of HNCA).

Each line below defines a (formal) magnetization transfer pathway that gives rise to an expected peak. in the case of HNCA there are two lines, corresponding to the intraresidual and sequential peak. For instance, the definition for the intraresidual peak starts with the probability to observe the peak (0.980), followed by a series of atom types, e.g. H_AMI for amide proton etc. An expected peak is generated for each molecular fragment in which these atom types occur connected by single covalent bonds. The atoms whose chemical shifts appear in the spectrum are identified by their labels followed by ':', e.g. for HNCA 'HN:', 'N:', and 'C:'.


Hint: For information on how to use the vi terminal editor: vi editor

eNORA

  • work in the copy of the data directory ('cd enoe')

Using the text editor of your choice, create your 'init.cya' macro as outlined below (The init macro) and also your 'CALC_enoe.cya' macro. Be extra careful to avoid typos and unwanted spaces in coma lists etc.

The init macro

The initialization macro file has the fixed name 'init.cya' and is executed automatically each time CYANA is started. It can also be called any time one wants to reinitialize the program by typing 'init'. It contains normally at least two commands that read the CYANA library and the protein sequence:

rmsdrange:=8-33
cyanalib
read seq demo.seq
swap=0
expand=1

The first line sets the appropriate rmsdrange, and the command 'cyanalib' reads the standard CYANA library. The next command reads the protein sequence.

The protein sequence is stored in three-letter code in the file 'demo.seq'.

Stereo-specificity of dia-stereocenters
atoms stereo "HA? 10"
atoms stereo "HB? 7 8 11 13 14 21 23 24 25 26 27 33 34 35 37 38"
atoms stereo "HG? 8 12 14 17 36 37"
atoms stereo "HD2? 18 26 30"
atoms stereo "QG? 22"
atoms stereo "QD? 7"
atoms stereo "HE2? 33"
atoms stereo "HD? 8 14 21 37"
atoms stereo "HG1? 28"

However, one may do the following to supply all atoms as stereo specific:

atoms select
atoms stereo

or to supply all atoms as non stereo specific, use:

atoms select
atoms stereo delete

To get a feedback of the supplied stereo specific assignment add to your 'init.cya' the command:

atoms stereo list
D20 exchange/labeling scheme
atoms set "H" protlev=0.97

The eNORA CALC macro

The 'CALC_enoe.cya' starts with the specification of the names of the input peak lists:


  • The input peak lists that will be used (as defined above).


# ----------------------------------  basic parameter definitions ---------------------------------- 
echo:= on
mixingtimes = '0.02,0.03,0.04,0.05,0.06'
protlevel   = 0.97
avgrho      = 5.3
tauc        = 4.25
b0field     = 700
maxdistance = 6.5
rhofile     = 'rhoInApo.rho'
normspin    = 2
bname 	    ='bupplots'
dname       ='decplots'

# ----------------------------------       structure input       ---------------------------------- 

# specify the conformers for calculations
read pdb demo rigid
#structure select 1-40
structure select 1

# ----------------------------------          peak input          ----------------------------------  
# read nmrDraw tab file
read tab demo.tab

# sort the peaks
peaks sort

# write out peak lists
do i 1 npkl
  peaks select "** list=${i}"
  write peaks $i.peaks names
end do

# ----------------------------------       run eNORA routine       ----------------------------------

# read in the peak lists
do i 1 npkl
  read peaks $i.peaks $if(i.eq.1,' ','append')
end do	

# supply averaged rho or izero values
if (existfile('$rhofile')) then
  read rho $rhofile
end if
 
# initialize the routine, fit experimental decays and buildups
enoe init normalize=normspin normed=1 rhoavg=avgrho time=$mixingtimes

# print average experimental auto-relaxation and I(0) values to screen
enoe avgExpVal

exit

# write the auto-relaxation values to file
write rhoOut.rho

# calculate the spin-diffusion correction
enoe spindiff b0field=b0field tauc=tauc maxdist=maxdistance mode=1 rmode=1

# apply spin-diffusion correction to experimental buildups (if calculated) and calculate reff
enoe reff b0field=b0field tauc=tauc

# prepare the cyana restraints (scaling, error margins)
enoe restraint errStereoFlag=1 errStereo=-1 chiN=-1 b0field=b0field tauc=tauc

# ----------------------------------    write restraints    ----------------------------------

# delete fixed distances from upl/lol output
distance delete fixed

# write the distance restraints to file
write upl enoe.upl
write lol enoe.lol

# ----------------------------------        plotting        ----------------------------------
enoe buildup b0field=b0field tauc=tauc plot=$bname
graf $bname.pdf

enoe decay plot=$dname
graf $dname.pdf

# ----------------------------------       overview      ----------------------------------
enoe overview

When you have prepared the 'init.cya' and the 'CALC_enoe.cya' try to run the macro.

To run the FLYA calculation, one could start CYANA and execute the 'CALC_enoe.cya' macro from the CYANA prompt, however since later you will preferably be working on a computer with multiple processors it is better get used to run the calculation by running the 'CALC_enoe.cya' as such:

cyana CALC_enoe.cya

The program if all is setup properly the program will run and display averages per spin type of autorelaxation and I (0) values and then stop. The command

exit

following the command 'enoe avgExpVal' stops the routine from running to completion. Before commenting out the exit command and running the routine to its end, do the following exercise.

Exercise x: Compiling the autorelaxation file

Before compiling the rhoIn.rho file, check the diagonal decays for their quality.

This is best done visually by copying the commands to plot the decay:

enoe decay plot=$dname
graf $dname.pdf

and add them right following the 'enoe avgExpVal' command.

When compiling the rhoIn.rho file, follow the instructions given here

eNORA output files

The FLYA algorithm will produce the following output files:

  • missRhoIzero: List of spins that lack a diagonal decay.
  • rhoOut.rho: Experimentally fitted rho and I(0) values values form diagonal decays.
  • enoe.upl and enoe.lol: Upper limit and lower limit restraint files with tolerances applied.
  • enoe.ovw: Collated results file.
  • bupplots:
  • decplots:

The missRhoIzero file

List of spins that lack a diagonal decay and therefore have no experimentally determined rho and I(0) values. The file is formatted to be used as basis for a rhoIn.rho file derived from average values. The file may also be used to setup generic normalized eNOE calcuations, see .....

The rhoOut.rho file

The enoe.upl/lol distance restrains

Tolerances Comments


The enoe.ovw file

Collated file that may be generated at any time during the routine and will be populated with the values available at the momentary progress of calculations.

  • ASSIGNMENT(i->j): Assignment arranged with the flow of magnetization.
  • REFFixEXP: The experimental sigma of spin x, where x=i,j
  • REFFxCORR: The reff of spin x, where x=i,j
  • SIGxEXP: The experimental sigma of spin x, where x=i,j
  • SIGxCORR: The spin-diffusion corrected experimental sigma of spin x, where x=i,j
  • SDCx: The spin-diffusion correction of spin x, where x=i,j
  • SDPx: The number of other partner spins involved in spin-diffusion for spin x, where x=i,j
  • IZEROx: The back calculated I(0) value of spin x, where x=i,j
  • exp_chiNx: The goodness of fit of the experimental data, where x=i,j
  • corr_chiN x: The goodness of fit of the spin-diffusion corrected experimental data, where x=i,j

Exercise xx: Mapping calculated eNOE restraints onto a known structure

One can map the calculated restraints, such as distance restraints (upl/lol) onto a known structure (in the example here an xray structure). This is an approach to analyze restraints and their influence on the results.

Below you find the commands to accomplish this. You see by studying the commands, which files are needed to execute the macro. Therefore, create a new directory ('mkdir') or copy a directory containing the respective files. Delete what you do not need. Use the regularized xray structure from exercise xxxx.

Commands preceded by hashtags (#) are commented out, remove the hashtags if you want to use them.

You need an init file:

rmsdrange:=8-33
cyanalib

And the main macro (name it 'CALC_xraymap.cya'):

read seq demo.seq


The following block of commands, reads the 'enoe.upl' and 'enoe.lol' restraints:

read upl enoe.upl
read lol enoe.lol
read regula.pdb unknown=warn

weight_vdw=0
overview enoe_xray.ovw
  • If the restraints do not match with the xray structure, does it mean they are wrong?
  • If you tried the two options, what is (are) the difference(s)?

Using Talos to generate torsion angle restraints

Torsion angle restraints from the backbone chemical shifts help restrict angular conformation space. We wish to use only "strong assignments" to generate these restraints.

If you do not have TALOS installed get it from here. It is part of the nmrpipe software package.

Exercise x: Calculate backbone torsion angle restraints using Talos

cp -r enoe acoPREP
cd acoPREP
rm *.tab enoe* *.out *.rho

Hint: You only need the 'init.cya' and the '1.peaks' file to calculate the torsion angle restraints.

Use a text editor of your choice to create a 'CALC_talos.cya' file with the commands to calculate the talos angle restraints:

# convert
read 1.peaks
shifts initialize
shifts adapt
atom set "* shift=990.0.." shift=none
write demo.prot

read prot demo.prot unknown=skip

talos talos=talos+                
talosaco pred.tab

write aco talos.aco

This will call the program TALOS+ and store the resulting torsion angle restraints in the file 'talos.aco'.

Since this is not a calculation suited for the MPI scheduler, start CYANA first, then call the 'CALC_talos.cya' macro from the prompt.


Hint: change to a cshell before running cyana (since talos needs a cshell to run):

csh

Multi-state structure calculation

We will perform calculations based on eNOEs by using torsion angle dynamics in order to compute the three-dimensional structure of the protein.

The 'enoe.upl' and 'enoe.lol' files will be used together with the aco based on chemical shifts of the backbone and scalar couplings from backbone, Ha-HB and aromatic residues determined by experiment.

Exercise x: Calculate a single state structure

Copy the 'flyabb' directory and give it the name 'noebb', then delete all the files and data we do not need to reduce clutter and have better oversight.

cp -r flyabb noebb
cd noebb
rm *asn.peaks *exp.peaks *.out *.job
rm -rf details

From the directory 'acoPREP' copy the calculated talos restraints ('talos.aco').

Inside the 'noebb' directory, use a text editor to edit the 'CALC.cya' file for noeassign as outlined.

The single state CALC macro

restraints:= talos.aco                    		
structures := 100,20                      		
steps:= 10000                       		
randomseed:= 434726    
                  		

To speed up the calculation, you can set optionally in 'CALC_sState.cya':

structures:=50,10
steps=5000

These commands tell the program to calculate, in each cycle, 50 conformers, and to analyze the best 10 of them. 5000 torsion angle dynamics steps will be applied per conformer. If you do not set these option 100 conformers will be calculate, and the 20 best will be analyzed and kept.

When you are done preparing the macros as outlined run the calculation.

The structure calculation will be performed by running the 'CALC_sState.cya' macro:

cyana -n 33 CALC_sState.cya

Doing this, basically means each processor will calculate 100/33=3 conformers. If you changed the setup to calculate 50 structures, you would start the calculation with 'cyana -n 25 CALC_sState.cya'.

Carefully analyze the WARNING and ERROR messages if any.


Statistics on the the structure calculation will be displayed to screen.

The final structure will be 'final.pdb'.

Exercise x: A two-state structure calculation

Copy the noebb directory and give it the name noecc, then delete all the previous, unnecessary output files to reduce clutter and have better oversight.

cp -r noebb noecc
cd noecc
rm *cycle* *.out *.job final* rama*

Update the 'init.cya' file in order to read the ligand library file and the

The init macro

The init macro needs to be updated to read the two-state sequence in order to prepare the restraints and run the structure calculation:

cyanalib
if (existfile('bundle.seq')) then
  read seq bundle.seq
  molecules define *
  atom set * vdwgroup=bundle
else
  read seq demo.seq
end if
rmsdrange:=8-33

swap=0
expand=1

atoms stereo "HA? 10"
atoms stereo "HB? 7 8 11 13 14 21 23 24 25 26 27 33 34 35 37 38"
atoms stereo "HG? 8 12 14 17 36 37"
atoms stereo "HD2? 18 26 30"
atoms stereo "QG? 22"
atoms stereo "QD? 7"
atoms stereo "HE2? 33"
atoms stereo "HD? 8 14 21 37"
atoms stereo "HG1? 28"

atoms set "H" protlev=0.97

The PREP macro

The two-state CALC macro

syntax inputseed=@i=3771

if (master) then
  PREP
end if

# ------ Structure calculation ------
read upl bundle.upl
read lol bundle.lol
read aco bundle.aco
read cco bundle.cco
#read rdc bundle.rdc
if (existfile('together.upl')) read upl together.upl append

anneal_weight_rdc := 0.0, 0.5
anneal_weight_aco := 1.0, 1.0, 0.0, 0.0
anneal_weight_cco := 0.0, 0.5

seed=inputseed
calc_all 100 steps=50000

if (master) then
  cut_cco=1.0
  cut_rdc=3.0
  weight_aco = 0.0

  rmsdrange:=8-33,108-133
  overview bundle structures=20 pdb

  read pdb bundle.pdb
  rmsdrange:=11-16,21-26,31-34,111-116,121-126,131-134
  overview bundleSec structures=20 pdb # reference=xxx.pdb

  molecules sort "BACKBONE 26-30" base=1
  write sortStates.pdb all
  SPLIT
end if

These commands tell the program to calculate, in each cycle, 50 conformers, and to analyze the best 10 of them. 5000 torsion angle dynamics steps will be applied per conformer. If you do not set these option 100 conformers will be calculate, and the 20 best will be analyzed and kept.

When you are done preparing the macros as outlined run the calculation.

The structure calculation will be performed by running the 'CALC_sState.cya' macro:

cyana -n 33 CALC_sState.cya

Doing this, basically means each processor will calculate 100/33=3 conformers. If you changed the setup to calculate 50 structures, you would start the calculation with 'cyana -n 25 CALC_sState.cya'.

Carefully analyze the WARNING and ERROR messages if any.


Statistics on the the structure calculation will be displayed to screen.

The SPLIT macro

Grouping the coordinates of multi-state calculations

Results: analysis

Exercise xx: Preparing an xray structure to use within CYANA

Deposited structures often lack specific features. i.e. Xray structures usually lack proton coordinates.

Copy your noecc results to a new directory call regulabb, then delete all the previous, unnecessary output files to reduce clutter and have better oversight.

cp -r noecc regulabb
cd regulabb
rm *cycle* *.out *.job

After reading the sequence file, the pdb file can be read with the option unknown=warn or unknown=skip, this will then skip the parts of the molecule not specified in the sequence file.

read pdb xxxx.pdb unknown=warn

Other options to read pdb's:

read 5c5a.pdb unknown=warn hetatm new

where the option 'hetatm' allows for reading of coordinate labeled HETATM, rather than ATOM in the pdb. 'new' will read the sequence from the pdb.

To write back out pdb's and sequences:

write pdb XXX.pdb
write seq XXX.seq  

Inspect the pdb using chimera: Now, there are several issues besides HETATM, that make the comparison to the calculated NMR structure not possible within CYANA before you fix them. You may use a graphical text editor to fix them. In the end, you need to have a conformer of the complex ready to compare with the calculated NMR structure.

Best would be to practice the use of the 'regularize' command as well. This is however not really necessary in this particular case, since this xray structure contains proton coordinates. Using the regularize command one can get a structure calculated within CYANA that has these features but still is very close to the input structure of your choice.


Create an 'init.cya' macro with:

cyanalib

Then create a 'CALC_reg.cya' macro with:

read 5c5a.pdb unknown=warn hetatm new
write 5c5a_Ed.seq 
write 5c5a_Ed.pdb

#renumber and rename the ligand from 201 333, NUT to LIG
library rename "@NUT" residue=LIG
atoms select @LIG
atoms set residue=333

write 5c5a_renum.seq
write 5c5a_renum.pdb

#sequence with ligand but without linker
read demoLongEd.seq
read 5c5a_renum.pdb rigid unknown=warn

write XrayAChainRenum.pdb

initialize

read seq demoLong.seq
read pdb XrayAChainRenum.pdb unknown=warn

write pdb test.pdb

read pdb test.pdb
regularize steps=20000 link=LL keep

Execute the 'CALC_reg.cya' macro in the CYANA shell (or use only one processor, do not distribute the job):

cyana CALC_reg.cya


Download and install the molecular viewer Chimera

  1. Download Chimera (to your personal laptop) from: Chimera

Exercise xx: Single state structure analysis

The final structure will be 'final.pdb'. You can visualize it, for example, with the command

chimera final.pdb

Analyze the result, the bundle seems unnatuarly tight for an NMR structure bundle. Why?

Exercise xx: Calclulate the RMSD of NMR vs. xray structure using a CYANA macro

Using the INCLAN language of CYANA (Writing and using INCLAN macros,Using INCLAN variables,Using INCLAN control statements) it is possible to write complex macros that interact with the FORTRAN code of CYANA. Reading internal variables and manipulating them to achieves custom task.

  • save the manually edited xray structure (exercise 11) or the the regularized xray structure (containing the ligand and called 'regula.pdb') as 'reg_xray.pdb' to use the macro below (or change the name in the macro accordingly).
  • what do you think about the RMSD, does the value make sense? Does the range make sense?

Below you find the commands for a macro (call it 'CALC_RMSD.cya') that will read the regularized xray structure and the calculated nmr structure, then calculating the rmsd of both the protein and ligand parts of the complex:

read demoLong.seq

rmsd range=15-111 structure=final.pdb reference=reg_xray.pdb

atom select "BACKBONE 15-111"
t=rmsdmean
j=rindex('333')
n=0
s=0.0
do i ifira(j) ifira(j+1)-1
  if (element(i).gt.1) then
    n=n+1
    s=s+displacement(i)
   end if
end do
print "RMSD of the LIG: ${s/n} ($n atoms)"

read pdb final.pdb
structure mean
write pdb mean.pdb

read pdb mean.pdb
read pdb reg_xray.pdb append

atom select "BACKBONE 15-111"
t=rmsdmean
atom select "WITHCOORDALL"
j=rindex('333')
n=0
s=0.0
do i ifira(j) ifira(j+1)-1
  if (element(i).gt.1.and.asel(i)) then
    n=n+1
    s=s+displacement(i)*2
  end if
end do
print "Displacement of the LIG (to ref xray): ${s/n} ($n atoms)"


Exercise xx: Analysis of multi-state calculations

On improving the final structure

Using what you have learned so far, employing some of the options


General questions to answer regarding this task:

  • Name additional experimental restraints (or inputs) you could use for structure calculation.
  • Name additional NMR experiments you could measure, to acquire experimental data that are not supplied with the demo_data.

eNORA extensions and options

There are a variety of commands to modify eNORA runs to accommodate experimental labeling schemes or etc...


Averaging of spin diffusion over multiple conformers

Three spin approach (TSS)

Labeling Schemes