ENORA and multi-state structure calculations
Revision as of 13:21, 2 April 2019
In this tutorial we will provide you with a guided example for calculating eNOEs and a multi-state structure calculation.
To this end we will first run the modules of eNORA within CYANA and then use the obtained eNOEs to calculate a single state and a two-state structure model using automated sorting to separate the states. Along the way you will learn some additional CYANA skills useful for other purposes as well.
To finalize you will .... And ultimately you can try to improve ....
CYANA setup
Please follow these steps carefully (exact Linux commands are given below; you may copy them to a terminal):
- Go to your home directory (or data directory).
- Get the data for the practical from the server (eNORA_multiState.tar.gz).
- Unpack the input data for the practical.
- Get the demo version of CYANA for this practical.
- Unpack CYANA.
- Set up the CYANA environment variables.
- Change into the newly created directory 'eNORA'.
- Copy the demo_data directory to 'enoe'.
- Change into the subdirectory 'enoe'.
- Test whether CYANA can be started by typing its name, 'cyana'.
- Exit from CYANA by typing 'q' or 'quit'.
- Download Chimera (to your personal laptop) from: Chimera
cd ~
wget 'http://www.cyana.org/wiki/images/6/64/eNORA_multiState.tar.gz'
tar zxf eNORA_multiState.tar.gz
wget 'http://www.cyana.org/wiki/images/6/64/Cyana-3.98.9_Demo.tgz'
tar zxf Cyana-3.98.9_Demo.tgz
cd cyana-3.98.9/
./setup
cd ~
cd eNORA
cp -r demo_data enoe
cd enoe

cyana
___________________________________________________________________
CYANA 3.98 (mac-intel)
Copyright (c) 2002-17 Peter Guentert. All rights reserved.
___________________________________________________________________
Demo license valid for specific sequences until 2018-12-31
cyana> q
If all worked, you are ready to go in terms of everything related to CYANA!
If you want to return to your practical later, using your own Linux or Mac OS X computer, you can download the demo version of CYANA from http://www.cyana.org/wiki/images/6/64/Cyana-3.98.9_Demo.tgz.
Hint: More information on the CYANA commands etc. is in the CYANA 3.0 Reference Manual.
eNOE calculations
All eNOE-related calculations within CYANA are carried out using the eNORA modules.
NOESY experiments measured at different mixing times (keeping the mixing times as much as possible within the linear regime of the NOE buildup) supply very precise distance restraints for a structure calculation. In addition, other restraints such as backbone torsion angle restraints derived from chemical shifts and scalar couplings for backbone and aromatic side chains are used.
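As a rough illustration of why buildup rates translate into distances: in the isolated spin-pair approximation the cross-relaxation rate scales with the inverse sixth power of the distance, so an unknown distance can be estimated relative to a reference pair of known distance. The Python sketch below assumes exactly this simplification. The function name and all numbers are hypothetical, and this is not the eNORA implementation.

```python
def distance_from_rate(sigma: float, sigma_ref: float, r_ref: float) -> float:
    """Estimate a distance from a cross-relaxation rate.

    Assumes the isolated spin-pair approximation, sigma ~ r**-6, so that
    r = r_ref * (sigma_ref / sigma) ** (1/6). Both rates must have the
    same sign and unit; the result has the unit of r_ref.
    """
    return r_ref * (sigma_ref / sigma) ** (1.0 / 6.0)


# Hypothetical numbers, for illustration only: a reference pair at 1.8 A
# whose rate is 64 times larger than that of the pair of interest.
r = distance_from_rate(sigma=0.5, sigma_ref=32.0, r_ref=1.8)
print(f"estimated distance: {r:.2f} A")
```

Because of the sixth root, even a sizeable error in the rate changes the distance only slightly, which is one reason why rates taken from the linear regime of the buildup can give tight restraints.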
Experimental input data
Peak lists in XEASY format are prepared by automatic peak picking with a visualization program such as CcpNmr Analysis, NMRDraw or NMRView and saved as XXX.peaks, where XXX denotes the name of the XEASY peak list file. Since NMRDraw peak lists have a different file format, CYANA provides the command 'read tab' to convert them to XEASY format.
# Number of dimensions 3
#FORMAT xeasy3D
#INAME 1 H
#INAME 2 HN
#INAME 3 N
#SPECTRUM N15NOESY H HN N
17086 4.098 4.099 57.441 1 U 6.990943E+08 0.000000E+00 e 0 HA.5 HA.5 CA.5
89532 4.355 1.829 33.507 1 U 1.720779E+06 0.000000E+00 e 0 HA.6 HB2.6 CB.6
89544 4.353 1.757 33.513 1 U 2.939628E+06 0.000000E+00 e 0 HA.6 HB3.6 CB.6
The first line specifies the number of dimensions (3 in this case). The '#SPECTRUM' line (no space between characters) gives the experiment type (N15NOESY, which refers to the corresponding experiment definition in the CYANA library), followed by an identifier for each dimension of the peak list (H HN N) that specifies which chemical shift is stored in the corresponding dimension of the peak list. In most uses, the experiment type and identifiers must correspond to an experiment definition in the general CYANA library (see below); here, however, we cheat slightly and get away with it. We are cheating because for eNOE calculations we record our NOESY spectra with simultaneous evolution of the 13C and 15N dimensions, since we require 15N- and 13C-bound spins within the same spectrum for normalization purposes (see...).
After the '#SPECTRUM' line follows one line for every peak. For example, the first peak in the list above has
- Peak number 17086
- H chemical shift 4.098 ppm
- 'HN' chemical shift 4.099 ppm (in this case a 13C-bound proton)
- Heavy atom chemical shift 57.441 ppm (in this case 13C)
The other entry relevant for the eNOE modules is the peak volume or intensity (6.990943E+08).
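To make the column layout concrete, the following Python sketch splits one peak line into named fields. The class `Peak` and the helper `parse_peak_line` are illustrative assumptions and not part of CYANA; only the column order is taken from the example list above.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    number: int
    shifts: tuple      # chemical shifts in ppm, one per dimension
    volume: float      # peak volume or intensity
    assignment: tuple  # atom labels, one per dimension (if present)


def parse_peak_line(line: str, ndim: int = 3) -> Peak:
    """Split one peak line of an XEASY-style list into its fields.

    Assumed column order (read off the example list): peak number,
    ndim shifts, two flag columns, volume, volume error, two more
    columns, then the optional assignment labels.
    """
    fields = line.split()
    number = int(fields[0])
    shifts = tuple(float(x) for x in fields[1:1 + ndim])
    volume = float(fields[1 + ndim + 2])
    assignment = tuple(fields[1 + ndim + 6:])
    return Peak(number, shifts, volume, assignment)


peak = parse_peak_line(
    "17086 4.098 4.099 57.441 1 U 6.990943E+08 0.000000E+00 e 0 HA.5 HA.5 CA.5"
)
print(peak.number, peak.shifts, peak.volume, peak.assignment)
```

A reader like this is handy for quick sanity checks, for example to confirm that every peak carries a non-zero volume before the eNOE modules are run.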
Hint: The formats of other CYANA files are described in the CYANA 3.0 Reference Manual.
The protein sequence is supplied by three-letter code in a XXX.seq file.
As part of the supplied data for the exercises there are two sequences:
- demo.seq
SPECTRUM definitions in the CYANA library
When you start CYANA, the program reads the library and displays the full path name of the library file. You can open the standard library file to inspect, for example, the NMR experiment definitions. For instance, the definition for the HNCA spectrum (search for 'HNCA' in the library file 'cyana.lib') is
SPECTRUM HNCA HN N C
0.980 HN:H_AMI N:N_AM* C:C_ALI C_BYL
0.800 HN:H_AMI N:N_AMI (C_ALI) C_BYL C:C_ALI
The first line corresponds to the '#SPECTRUM' line in the peak list. It specifies the experiment name and identifies the atoms that are detected in each dimension of the spectrum. The number of identifiers defines the dimensionality of the experiment (3 in the case of HNCA).
Each line below defines a (formal) magnetization transfer pathway that gives rise to an expected peak. In the case of HNCA there are two lines, corresponding to the intraresidual and the sequential peak. For instance, the definition for the intraresidual peak starts with the probability to observe the peak (0.980), followed by a series of atom types, e.g. H_AMI for amide proton etc. An expected peak is generated for each molecular fragment in which these atom types occur connected by single covalent bonds. The atoms whose chemical shifts appear in the spectrum are identified by their labels followed by ':', e.g. for HNCA 'HN:', 'N:', and 'C:'.
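The structure of a pathway line can be made explicit with a few lines of Python. This is only a reading aid: `parse_pathway` is a hypothetical helper, not a CYANA function, and it interprets nothing beyond what is described above (a leading probability, then atom types, some of them carrying a 'label:' prefix).

```python
def parse_pathway(line: str):
    """Split one magnetization-transfer line into probability and atom types.

    Returns (probability, steps), where each step is (label, atom_type) and
    label is None for atoms whose shift is not observed in the spectrum.
    """
    fields = line.split()
    probability = float(fields[0])
    steps = []
    for field in fields[1:]:
        if ":" in field:
            label, atom_type = field.split(":", 1)
        else:
            label, atom_type = None, field
        steps.append((label, atom_type))
    return probability, steps


prob, steps = parse_pathway("0.980 HN:H_AMI N:N_AM* C:C_ALI C_BYL")
observed = [label for label, _ in steps if label is not None]
print(prob, observed)
```

The labels collected in `observed` are the ones that must match the identifiers on the 'SPECTRUM' line (HN N C for HNCA).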
Hint: For information on how to use the vi terminal editor: vi editor
eNORA
- work in the copy of the data directory ('cd enoe')
Using the text editor of your choice, create your 'init.cya' macro as outlined (The init macro) and also your 'CALC_enoe.cya' macro (The eNORA CALC macro) to run eNORA. Be extra careful to avoid typos and unwanted spaces in comma-separated lists etc.
Execution scripts or "macros" in CYANA
For more complex tasks within CYANA, rather than entering the commands line by line at the CYANA prompt, the necessary commands are collected in a file named '*.cya'. Collecting the commands in macros has the added advantage that the macros serve as a record from which previous calculations can be reconstructed.
The init macro
The initialization macro file has the fixed name 'init.cya' and is executed automatically each time CYANA is started. It can also be called any time one wants to reinitialize the program by typing 'init'. It normally contains at least two commands that read the CYANA library and the protein sequence:
rmsdrange:=15-111
cyanalib
read demoShort.seq
The first line sets the appropriate rmsdrange, and the command 'cyanalib' reads the standard CYANA library. The next command reads the protein sequence.
The protein sequence is stored in three-letter code in the file 'demo.seq'.
The eNORA CALC macro
The 'CALC_enoe.cya' macro starts with the specification of the names of the input peak lists:
- The input peak lists that will be used (as defined above).
When you have prepared the 'init.cya' and the 'CALC_enoe.cya' macros, try to run the macro.
To run the eNORA calculation, one could start CYANA and execute the 'CALC_enoe.cya' macro from the CYANA prompt; alternatively, pass the macro to CYANA directly on the command line:
cyana CALC_enoe.cya
eNORA output files
The eNORA calculation will produce the following output files:
- enoe.ovw: Consensus ....
The enoe.ovw file
- #Expected: Total number of expected peaks
- noRef: Number of expected peaks with missing reference shifts
- noPeak: Number of expected peaks for which no peak can be measured
There is more information on the results of the assignment calculation in the 'flya.txt' file (not described here).
Exercise x: compile the autorelaxation file
Using Talos to generate torsion angle restraints
Torsion angle restraints from the backbone chemical shifts help restrict angular conformation space. We wish to use only "strong assignments" to generate these restraints.
If you do not have TALOS installed get it from here. It is part of the nmrpipe software package.
Exercise x: Calculate backbone torsion angle restraints using Talos
Hint: Copy the FLYA results into a new folder, since otherwise you will overwrite your original 'flya.prot' file.
Essentially you will need to copy the details directory and the 'flya.prot' file.
cp -r flyabb acoPREP
cd acoPREP
rm *.peaks *.out *.job
Use a text editor of your choice to create a 'CALC.cya' file with the commands to calculate the talos angle restraints.
TALOS is used to generate torsion angle restraints from the backbone chemical shifts in 'flya.prot'.
consolidate reference=flya.prot file=flya.tab plot=flya.pdf prot=details/a[0-9][0-9][0-9].prot
This overwrites the original flya.prot with only strong assignments.
read prot flya-strong.prot unknown=skip
talos talos=talos+
talosaco pred.tab
write aco talos.aco
This will call the program TALOS+ and store the resulting torsion angle restraints in the file 'talos.aco'.
Since this is not a calculation suited for the MPI scheduler, start CYANA first, then call the 'CALC.cya' macro from the prompt.
Hint: change to a cshell before running cyana (since talos needs a cshell to run):
csh
Multi-state structure calculation
We will perform calculations based on eNOEs by using torsion angle dynamics in order to compute the three-dimensional structure of the protein.
The 'enoe.upl' and 'enoe.lol' files will be used together with the torsion angle restraints (aco) derived from the backbone chemical shifts and from experimentally determined scalar couplings for the backbone, HA-HB and aromatic residues.
Exercise x: Calculate a single state structure
Copy the 'flyabb' directory and give it the name 'noebb', then delete all the files and data we do not need to reduce clutter and have better oversight.
cp -r flyabb noebb
cd noebb
rm *asn.peaks *exp.peaks *.out *.job
rm -rf details
From the directory 'acoPREP' copy the calculated talos restraints ('talos.aco').
Inside the 'noebb' directory, use a text editor to edit the 'CALC.cya' file for noeassign as outlined.
The single state CALC macro
restraints:= talos.aco
structures := 100,20
steps:= 10000
randomseed:= 434726
To speed up the calculation, you can set optionally in 'CALC_sState.cya':
structures:=50,10
steps=5000
These commands tell the program to calculate, in each cycle, 50 conformers, and to analyze the best 10 of them. 5000 torsion angle dynamics steps will be applied per conformer. If you do not set these options, 100 conformers will be calculated, and the 20 best will be analyzed and kept.
When you are done preparing the macros as outlined run the calculation.
The structure calculation will be performed by running the 'CALC_sState.cya' macro:
cyana -n 33 CALC_sState.cya
This means that each processor will calculate about 100/33 ≈ 3 conformers. If you changed the setup to calculate 50 structures, you would start the calculation with 'cyana -n 25 CALC_sState.cya'.
Statistics on the structure calculation will be displayed on screen.
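The arithmetic behind the choice of '-n' can be checked with a short Python sketch. It assumes that the conformers are spread as evenly as possible over the processes; how CYANA actually schedules the work is not described here, and `conformers_per_process` is a hypothetical helper used only for this illustration.

```python
def conformers_per_process(n_conformers: int, n_processes: int) -> list:
    """Spread n_conformers as evenly as possible over n_processes."""
    base, extra = divmod(n_conformers, n_processes)
    # The first `extra` processes receive one conformer more than the rest.
    return [base + 1 if i < extra else base for i in range(n_processes)]


for n_conf, n_proc in [(100, 33), (50, 25)]:
    load = conformers_per_process(n_conf, n_proc)
    print(f"{n_conf} conformers on {n_proc} processes: "
          f"min {min(load)}, max {max(load)} per process")
```

Under this assumption, 100 conformers on 33 processes leave one process with a fourth conformer, whereas 50 conformers on 25 processes divide evenly. Choosing a process count that divides the number of conformers avoids idle processes at the end of the run.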
The final structure will be 'final.pdb'. You can visualize it, for example, with the command
chimera final.pdb
The optimal residue range for superposition can be found with the command
cyana overlay final.pdb
Run noeassign with your 'CALC.cya' macro.
You can check the statistics (and success of 'noeassign') by running:
cyanatable
Creating the ligand library file for CYANA
In the next three exercises you will create the ligand library file for CYANA from scratch. Do this carefully and check your result, otherwise your structure calculation will not work as intended.
Exercise 6: Drawing the molecule and obtaining the SMILES code
- make a copy of the libex and work in there (libexbb)
cp -r libex libexbb
cd libexbb
Go to the ZINC website.
Click on the Structure tab and draw the molecule using the supplied drawing (LIG.png) of the compound as a guide. Copy the SMILES code.
Hint: To look at the supplied image file in the terminal, use:
xdg-open LIG.png
Exercise 7: Converting the SMILES code to mol2
- work in the copy of the libex directory ('cd libexbb')
There are many options and programs to do this, we outline two:
If you can use Avogadro (best):
For Mac OS download Avogadro from: Avogadro
Build --> Insert --> SMILES
Paste the SMILES code
Extensions --> Optimize Geometry
Save as
--> LIG.mol2 (*.mol2)
If you have to use chimera:
(If you are on the linux server chimera is installed)
Tools --> Structure Editing --> Build Structure
Start Structure --> SMILES string
set the Residue name to LIG (capital letters)
--> Apply
Save your mol2 file as: LIG.mol2
Now, there is one issue we have to take care of: The intermolecular NOE assignments have to match the ligand structure assignment, otherwise the intermolecular NOEs will be wrong.
Using the text editor of your choice, manually change the "UNL1" in your mol2 file to "LIG".
Then open the supplied demoLIG.pdb structure in chimera, as well as the created mol2 structure.
chimera demoLIG.pdb LIG.mol2
First check the geometry (especially the rings and the stereochemistry). If it is wrong, fix it!
Using the text editor of your choice or, more conveniently, chimera, change the proton names in your mol2 file to match those of the pdb.
Hint: Overlay the two ligand structures in chimera.
Favorites --> Command Line
In the command line enter:
match #1 #0
Depending on how the models are loaded you may need to change the #? numbers. To see the model number use the Favorites --> Model Panel.
If chimera complains about "Unequal numbers of atoms chosen for evaluation", delete the pseudo atoms of 'demoLIG.pdb' temporarily for the overlay. In the command line
sel: @Q @Q? @Q??
delete sel
To rename selected atoms (control click) in the command line:
setattr a name HX sel
Hovering over atoms will display their names!
Exercise 8: Converting the mol2 file to a lib file for CYANA
- work in the copy of the libex directory ('cd libexbb')
- unpack the tool to convert the mol2 to a *.lib file
tar zxf cylib-2.0.tgz
run cylib with the options -nc -sc
./cylib-2.0/cylib -nc -sc LIG.mol2
this will create the LIG.lib file.
The -sc option keeps the angles of the rings fixed. We can do this since in this molecule the rings are either aromatic or have sp3 conjugated carbons in them, which fixes the ring geometry. If they had to be flexible, you would need to keep the angles flexible and supply additional restraints to close the rings.
To test the lib file we need CYANA:
Create a sequence file containing 'LIG 333' and name the file 'LIG.seq'.
Start CYANA. This will read the CYANA library file correctly but give you the error:
*** ERROR: Illegal residue name "LIG".
*** ERROR: Cannot read line 1: LIG 333
Because we do not have an init file and have not read the 'LIG.lib' file yet, the program just tries to read the default sequence file in the directory, but the ligand is not yet in the library, so it fails...
read lib LIG.lib append
read seq LIG.seq
anneal
atoms select "* - &DUMMY"
pseudo=1
write pdb test.pdb selected
The setting 'pseudo=1' ensures that the pseudo atoms are included in the written pdb file; 'atoms select "* - &DUMMY"' followed by 'write pdb test.pdb selected' prevents the dummy atoms of the linker from being written to the pdb file.
Hint: Since you might have to do this a few times until the library is working and correct, it might be worthwhile to create an 'init.cya' and a 'CALC.cya' macro with the respective commands. This speeds things up and prevents the error output shown above.
Carefully analyze the WARNING and ERROR messages if any.
Then take a look at your 'test.pdb' in chimera and check that the chemistry and bonds are all as expected (ring closure!)
chimera test.pdb
Again overlay the 'LIG.pdb' with the provided 'demoLIG.pdb'.
If there are any issues, "go back to the drawing board" to fix them. Also check the pseudo atom names carefully, since they are used in the intermolecular NOEs later.
To help find problems, you may use the command:
write lib LIG.lib names
This will write the library file containing actual atom names rather than numbers.
Alternative Exercise 6-8: Converting a pdb file to a lib file for CYANA
In case you were unsuccessful with exercises 6-8 in terms of getting a working ligand library file, do not despair! There is an easy workaround that you may be able to use in the real case as well: converting a pdb file to a library file for CYANA.
Use Avogadro:
File --> Open
Open the 5c5aLig.pdb
Save as
LIG.mol2 (*.mol2)
Rename the Residue to LIG in the LIG.mol2 file.
./cylib-2.0/cylib -nc -sc LIG.mol2
Done!
You can run the tests outlined above, using anneal etc to test your library file.
Calculating the structure of the protein-ligand complex
Exercise 9: (Semi-automatic) Intermolecular cross peaks assignment and structure calculation
Since the molecular system contains protein and ligand, CYANA has to read the 'LIG.lib' file in addition to the regular 'cyana.lib' file. The sequence file needs to contain the protein and the ligand (and a linker to connect the two).
Copy the noebb directory and give it the name noecc, then delete all the previous, unnecessary output files to reduce clutter and have better oversight.
cp -r noebb noecc
cd noecc
rm *cycle* *.out *.job final* rama*
Update the 'init.cya' file in order to read the ligand library file and the sequence file containing the linker and the ligand.
Add the 'read lib LIG.lib append' command after the 'cyanalib' command but before reading the sequence. 'append' is necessary, otherwise the library contents read from 'cyana.lib' will be overwritten by those of the 'LIG.lib' file.
rmsdrange:=15-111,333
cyanalib
read lib LIG.lib append
read seq demoLong.seq
We assign the intermolecular cross peaks by supplying noeassign with an intermolecular xeasy peak list in which only the ligand resonances are assigned. The ligand resonances were assigned manually from an additional set of experiments (the semi-automatic part). The resonance assignment thereby matches the ligand atom assignment in the library file created in the previous exercise.
The protein side will then be assigned by noeassign.
Update your previous 'CALC.cya' macro by adding intermol-NOEs.peaks to the peaks list and adding the keep=all option to the noeassign command:
peaks:= cnoesy.peaks,nnoesy.peaks,aro.peaks,intermol-NOEs.peaks
prot:= flya.prot
restraints:= talos.aco
tolerance:= 0.040,0.030,0.45
structures := 100,20
steps:= 10000
randomseed:= 434726
write_peaks_names=.true.
assign_noartifact:="** list=intermol-NOEs.peaks"
noeassign peaks=$peaks prot=$prot keep=all selectcombine="* - @LIG" autoaco
The command 'assign_noartifact' effectively disables network anchoring tests for the ligand. Since the list supplied is cleaned and presumed artifact-free, we are allowed to do this. We thereby encourage the use of the intermolecular NOEs even if the support by other nearby NOEs is weak. The command 'write_peaks_names=.true.' ensures that the assigned peak lists are written to file with the actual resonance names (this is not xeasy standard).
You can run the calculation again, commenting out (#) the 'assign_noartifact' command, and see the effect on the final structure.
'selectcombine' changes how the testing for errors is done: intermolecular peaks do not have to compete with intra-protein peaks.
Run the calculation:
cyana -n 33 CALC.cya
Comparing the calculated NMR structure to an XRAY reference structure
Exercise 10: Compare the NMR structure to the Xray structure
Download (www.rcsb.org) the xray structure with ID: 5c5a
Use either a web-browser or the terminal:
wget 'https://files.rcsb.org/download/5c5a.pdb'
Using chimera it is possible to compare two structures, by overlaying and inspecting visually.
When you have your xray structure ready, load your calculated nmr structure and the xray structure in chimera.
Use chimera-specific commands to overlay the two structures and compare them visually.
Exercise 11: Preparing an xray structure to use within CYANA
Deposited structures often lack specific features; for example, xray structures usually lack proton coordinates.
Copy your noecc results to a new directory called regulabb, then delete all the previous, unnecessary output files to reduce clutter and have better oversight.
cp -r noecc regulabb
cd regulabb
rm *cycle* *.out *.job
After reading the sequence file, the pdb file can be read with the option unknown=warn or unknown=skip; this will skip the parts of the molecule not specified in the sequence file.
read pdb xxxx.pdb unknown=warn
Other options to read pdb's:
read 5c5a.pdb unknown=warn hetatm new
where the option 'hetatm' allows reading coordinates labeled HETATM rather than ATOM in the pdb file. 'new' will read the sequence from the pdb file.
To write back out pdb's and sequences:
write pdb XXX.pdb
write seq XXX.seq
Inspect the pdb using chimera: Now, there are several issues besides HETATM, that make the comparison to the calculated NMR structure not possible within CYANA before you fix them. You may use a graphical text editor to fix them. In the end, you need to have a conformer of the complex ready to compare with the calculated NMR structure.
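Before editing the file by hand, it can help to list which residues are stored as HETATM records. The Python sketch below relies only on the fixed-column layout of the PDB format (record name in columns 1-6, residue name in 18-20, chain in 22, residue number in 23-26). The helper `hetatm_residues` is hypothetical, and the three example records are made up for illustration, not taken from 5c5a.

```python
from collections import Counter


def hetatm_residues(pdb_lines) -> Counter:
    """Count HETATM records per (residue name, chain, residue number)."""
    counts = Counter()
    for line in pdb_lines:
        if line.startswith("HETATM"):
            # Fixed PDB columns, converted to 0-based Python slices.
            key = (line[17:20].strip(), line[21], int(line[22:26]))
            counts[key] += 1
    return counts


# Made-up records in PDB column layout, for illustration only.
example = [
    "ATOM      1  N   MET A  15      11.104   6.134  -6.504  1.00 20.00           N",
    "HETATM 1500  C1  NUT A 201       2.100   3.200   4.300  1.00 20.00           C",
    "HETATM 1501  C2  NUT A 201       3.100   3.900   4.800  1.00 20.00           C",
]
print(hetatm_residues(example))
```

To apply it to the downloaded file, pass an open file object instead of the list, e.g. `hetatm_residues(open("5c5a.pdb"))`. Every residue name that shows up needs either a library entry or has to be removed before the structure can be used within CYANA.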
Best would be to practice the use of the 'regularize' command as well. This is however not really necessary in this particular case, since this xray structure contains proton coordinates. Using the regularize command one can get a structure calculated within CYANA that has these features but still is very close to the input structure of your choice.
Copy your 'LIG.lib' file and name it 'NUT.lib', in the 'NUT.lib' file change the residue name from LIG to NUT. The 'NUT.lib' file is necessary to read the original xray structure with ligand into CYANA.
Copy the 'demoLong.seq' file and name it 'demoLongEd.seq', in the 'demoLongEd.seq' file delete the linker residues.
Create an 'init.cya' macro with:
cyanalib
read lib LIG.lib append
Then create a 'CALC_reg.cya' macro with:
read lib NUT.lib append
read 5c5a.pdb unknown=warn hetatm new
write 5c5a_Ed.seq
write 5c5a_Ed.pdb
#renumber and rename the ligand from 201 to 333, NUT to LIG
library rename "@NUT" residue=LIG
atoms select @LIG
atoms set residue=333
write 5c5a_renum.seq
write 5c5a_renum.pdb
#sequence with ligand but without linker
read demoLongEd.seq
read 5c5a_renum.pdb rigid unknown=warn
write XrayAChainRenum.pdb
initialize
read seq demoLong.seq
read pdb XrayAChainRenum.pdb unknown=warn
write pdb test.pdb
read pdb test.pdb
regularize steps=20000 link=LL keep
Execute the 'CALC_reg.cya' macro in the CYANA shell (or use only one processor, do not distribute the job):
cyana CALC_reg.cya
Exercise 12: Calculate the RMSD of NMR vs. xray structure using a CYANA macro
Using the INCLAN language of CYANA (Writing and using INCLAN macros, Using INCLAN variables, Using INCLAN control statements) it is possible to write complex macros that interact with the FORTRAN code of CYANA, reading internal variables and manipulating them to achieve custom tasks.
- save the manually edited xray structure (exercise 11) or the regularized xray structure (containing the ligand and called 'regula.pdb') as 'reg_xray.pdb' to use the macro below (or change the name in the macro accordingly).
- what do you think about the RMSD, does the value make sense? Does the range make sense?
Below you find the commands for a macro (call it 'CALC_RMSD.cya') that will read the regularized xray structure and the calculated nmr structure, then calculating the rmsd of both the protein and ligand parts of the complex:
read demoLong.seq
rmsd range=15-111 structure=final.pdb reference=reg_xray.pdb
atom select "BACKBONE 15-111"
t=rmsdmean
j=rindex('333')
n=0
s=0.0
do i ifira(j) ifira(j+1)-1
  if (element(i).gt.1) then
    n=n+1
    s=s+displacement(i)
  end if
end do
print "RMSD of the LIG: ${s/n} ($n atoms)"
read pdb final.pdb
structure mean
write pdb mean.pdb
read pdb mean.pdb
read pdb reg_xray.pdb append
atom select "BACKBONE 15-111"
t=rmsdmean
atom select "WITHCOORDALL"
j=rindex('333')
n=0
s=0.0
do i ifira(j) ifira(j+1)-1
  if (element(i).gt.1.and.asel(i)) then
    n=n+1
    s=s+displacement(i)*2
  end if
end do
print "Displacement of the LIG (to ref xray): ${s/n} ($n atoms)"
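The first loop of the macro sums the displacements of all ligand atoms that are not hydrogens and divides by their number. The Python sketch below reproduces that selection and averaging on made-up numbers, and shows a root-mean-square value next to it for comparison. Function names and data are hypothetical; this is not CYANA code.

```python
import math


def mean_heavy_atom_displacement(atoms):
    """Average displacement over non-hydrogen atoms.

    `atoms` is a list of (atomic_number, displacement) pairs; atoms with
    atomic number 1 (hydrogen) are skipped, like `element(i).gt.1` above.
    """
    heavy = [d for z, d in atoms if z > 1]
    return sum(heavy) / len(heavy), len(heavy)


def rms_heavy_atom_displacement(atoms):
    """Root-mean-square displacement over the same atom selection."""
    heavy = [d for z, d in atoms if z > 1]
    return math.sqrt(sum(d * d for d in heavy) / len(heavy)), len(heavy)


# Hypothetical displacements in Angstrom: (atomic number, displacement).
ligand = [(6, 0.5), (6, 1.0), (1, 2.0), (8, 1.5)]
mean_d, n = mean_heavy_atom_displacement(ligand)
rms_d, _ = rms_heavy_atom_displacement(ligand)
print(f"mean displacement {mean_d:.2f} A, RMS {rms_d:.2f} A ({n} atoms)")
```

The two numbers differ whenever the displacements are not all equal, which is worth keeping in mind when comparing the value printed by the macro with RMSD values reported elsewhere.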
Beyond The Basics: Improving the final structure
FLYA options
There are a variety of commands to modify FLYA runs to accommodate experimental labeling schemes or apply previous assignments etc...
Modify the chemical shift statistics used for assignment
Supply user-defined chemical shift statistics instead of standard BMRB statistics from library and replace the general statistics from 'cyana.lib' (CSTABLE).
- average value and stddev from input chemical shift list 'shiftx.prot'
- 'assigncs_sd:=bmrb' to use stddev from BMRB ('cyana.lib') instead of input chemical shift list
- 'assigncs_sdfactor:=0.5' to scale BMRB stddev by given factor
shiftassign_statistics:=predicted.prot
Modify the reported statistics
Groups of atoms for which assignment statistics will be calculated and reported in the 'flya.txt' output file can be defined as:
analyzeassign_group := BB: N H CA CB C
In this case, the command defines a group called BB (a name that can be chosen freely) comprising the atoms N, H, CA, CB, C.
The optional parameter 'shiftreference=manREF.prot' specifies reference chemical shift list, used only for comparison in flya.tab, flya.txt, flya.pdf:
shiftassign_reference:=manREF.prot
The same parameter may also be set as part of the flya command:
flya runs=10 assignpeaks=$peaks shiftreference=manREF.prot
Modify the expected peak lists
Specific labeling can be handled and peak list-specific atom selections can be applied.
To restrict the generation of expected peaks to a subset of atoms, here the backbone atoms:
command select_atoms
  atom select "N H CA CB C"
end
Input structures may be used to generate expected peaks for through-space experiments:
- specify with parameter 'structure' of the command 'flya'
- if parameter 'structure' is absent, a set of random structures is generated automatically
- if set to blank ('structure='), no random structures are generated (if not needed because only through-bond spectra are used)
flya runs=10 assignpeaks=$peaks structure=XXX.pdb
Experimental peaks may also be employed as expected peak lists:
- command N15NOESY_expect, reading input peak list N15NOESY_in.peaks
N15NOESY_expect :=N15NOESY_in
Keeping previously determined assignments
To keep input peak assignments in user peak assignments:
- (partially) assigned input peak list XXX.peaks
- parameter 'keepassigned' for 'loadspectra.cya'
loadspectra_keepassigned:=.true.
To fix input chemical shift assignments contained in a prot file
To do this i.e for backbone atoms extracted from the manREF.prot list:
Make a list of only the reference backbone chemical shifts by entering the CYANA commands:
read manREF.prot
atom set "* - H N CA CB C" shift=none
write fix.prot
The file 'fix.prot' will contain the reference chemical shifts only for the backbone (and CB) atoms H, N, CA, CB, C'. Now you can repeat the assignment calculation by inserting the 'shiftassign_fix:=fix.prot' statement in 'CALC.cya' and choosing only the input peak lists that are relevant for sidechain assignment:
shiftassign_fix:=fix.prot
Chemical shift assignment using exclusively NOESY
- increased population size with 'shiftassign_population=200'
- see Schmidt et al. J. Biomol. NMR 57, 193-204 (2013)
Speeding up FLYA runs
The settings below serve fast automated chemical shift assignment. The results are in general less accurate, since the populations are smaller, there are fewer parallel runs, or the optimization schedule is modified.
In production runs, better results can be expected (at the expense of longer computation times) if these parameters are not set.
There are three parameters of the assignment algorithm that can be set in order to speed up the calculation.
Fixed number of generations in evolutionary optimization:
shiftassign_population=25
The population size for the genetic algorithm, i.e. how many assignments form one generation (25; chosen smaller than in normal production runs in order to speed up the calculation).
There is also an option to choose the "quick" optimization schedule:
shiftassign_quick=.true.
And last the 'runs' option can be set for flya as we did in the exercise ('flya runs=10').
noeassign options
To learn more about noeassign, consult the tutorial Structure calculation with automated NOESY assignment. Other options for noeassign are described here: CYANA_Macro:_noeassign
Exercise 13: Mapping restraints onto a known structure
One can map the calculated restraints, such as distance restraints (upl/lol) onto a known structure (in the example here an xray structure). This is another approach to analyze restraints and their influence on the results.
Below you find the commands to accomplish this. You see by studying the commands, which files are needed to execute the macro. Therefore, create a new directory ('mkdir') or copy a directory containing the respective files. Delete what you do not need. Use the regularized xray structure from exercise 11.
Commands preceded by a hash sign (#) are commented out; remove the hash signs if you want to use them. If you decide to use the intermol-NOEs-cycle7.peaks file, make sure to comment out any commands you no longer need.
You need an init file:
rmsdrange:=15-111,333
cyanalib
read lib LIG.lib append
And the main macro (name it 'CALC_xraymap.cya'):
read seq demoLong.seq
The following block of commands takes the assigned intermol.peaks list and calculates distance restraints from the peak intensities:
#peaks:=intermol-NOEs-cycle7.peaks
#calibration peaks=$peaks
#peaks calibrate simple
#write upl intermol.upl
The following block of commands reads the 'final.upl' list (in this case of noeassign), selects the intermolecular NOEs to LIG and writes them to file:
read upl final.upl
distance select "*, @LIG" info=full
write intermol.upl

read intermol.upl unknown=warn
#read upl lig.upl append
#read lol lig.lol
read regula.pdb unknown=warn
weight_vdw=0
overview intermol_xray.ovw
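What mapping restraints onto a structure amounts to can be sketched in Python: for each upper distance limit, measure the corresponding distance in the structure and report by how much the limit is exceeded. The atom keys, coordinates and limits below are hypothetical, and `upl_violations` is an illustrative helper, not the algorithm behind the 'overview' command.

```python
import math


def upl_violations(coords, limits, cutoff=0.0):
    """Return restraints whose distance exceeds the upper limit.

    coords: dict mapping an atom key to (x, y, z) in Angstrom.
    limits: list of (atom_key_1, atom_key_2, upper_limit) tuples.
    """
    violated = []
    for a, b, upl in limits:
        dist = math.dist(coords[a], coords[b])
        if dist - upl > cutoff:
            violated.append((a, b, upl, dist - upl))
    return violated


# Hypothetical coordinates and limits, for illustration only.
coords = {
    ("LEU", 54, "HB2"): (0.0, 0.0, 0.0),
    ("LIG", 333, "H1"): (3.0, 4.0, 0.0),
    ("LIG", 333, "H2"): (1.0, 0.0, 0.0),
}
limits = [
    (("LEU", 54, "HB2"), ("LIG", 333, "H1"), 4.5),
    (("LEU", 54, "HB2"), ("LIG", 333, "H2"), 4.5),
]
for a, b, upl, excess in upl_violations(coords, limits):
    print(a, b, f"limit {upl:.2f} A exceeded by {excess:.2f} A")
```

A restraint flagged in this way is not necessarily wrong: the xray structure is a single conformer determined under different conditions, so a violation may equally point to a real difference between crystal and solution.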
- If the restraints do not match with the xray structure, does it mean they are wrong?
- If you tried the two options, what is (are) the difference(s)?
- Did you look at the LIG.upl/lol files in the demo_data folder, what are they? What type of NMR experiments are there to obtain them?
Exercise 14: Work on improving the final structure
Using what you have learned so far, employing some of the options of FLYA and noeassign, consider if it is possible to improve the resolution of the final structure.
General questions to answer regarding this task:
- Name additional experimental restraints (or inputs) you could use for structure calculation.
- Name additional NMR experiments you could measure, to acquire experimental data that are not supplied with the demo_data.