Automated calculation of a protein-ligand complex structure (AUREMN, Brazil 2018): Difference between revisions

From CYANA Wiki
No edit summary
 
(644 intermediate revisions by 2 users not shown)
Line 1: Line 1:
In this tutorial we will determine the resonance assignments and the structure of a protein-ligand complex using modules of CYANA.  
To this end we will create a ligand.mol2 and ligand.lib file.


== CYANA setup for the AUREMN Practical Course NMR in Campino (24-26 February 2018) ==
To this end we will first run the CYANA module FLYA to obtain the resonance assignments from backbone, side chain and NOESY experiments (more precisely, from the XEASY peak lists of these experiments).
 
Then we will use noeassign to assign the NOESY spectra and calculate the holo protein structure without the ligand.
 
In the next step we will draw the ligand, convert the resulting SMILES code to a *.mol2 file and generate the *.lib file for CYANA.
 
Then we will assign intermolecular peak lists and redo the structure calculation, this time of the protein-ligand complex.
 
Finally, you will compare the calculated NMR structure to an X-ray structure and generate statistics.
 
Ultimately, you can try to improve your structure results by studying and applying the options available within the FLYA and noeassign modules of CYANA.
 
 
== CYANA setup for the AUREMN Practical NMR Course in Campino (24-26 February 2018) ==


Please follow these steps carefully (exact Linux commands are given below; you may copy them to a terminal):


# Go to your home directory.
# Go to your home directory (or data directory).
# Get the input data for the practical from the server.
# Get the data for the practical from the server (AUREMN2018.tgz).
# Unpack the input data for the practical.
# Change into the newly created directory 'cyana'
# Get the demo version of CYANA for this practical.
# Run the setup script 'setupcyana'.
# Unpack CYANA.
 
# Setup the CYANA environment variables.
cd ~
# Change into the newly created directory 'AUREMN2018'.
cp -r xxxx/xxx/xxx/CyanaAUREMN2018.tgz
# Copy the demo_data directory to 'flyabb'.
tar zxf CyanaAUREMN2018.tgz
cd cyana
./setupcyana
# <li value="6"> Copy the directory 'flyaquick' containing the input data for the practical to 'flyabb'.
# Change into the subdirectory 'flyabb'.
# Test whether CYANA can be started by typing its name, 'cyana'.
# Exit from CYANA by typing 'q' or 'quit'.
# Download Chimera (to your personal laptop) from: [https://www.cgl.ucsf.edu/chimera/download.html Chimera]
# Download Avogadro (to your personal laptop) from: [https://avogadro.cc/ Avogadro]


  cp -r flyaquick flyabb  
cd ~
cp /home/julien/AUREMN2018.tar.gz .
tar zxf AUREMN2018.tar.gz
wget <nowiki>'http://www.cyana.org/wiki/images/6/64/Cyana-3.98bin-180213Demo.tgz'</nowiki>
tar zxf Cyana-3.98bin-180213Demo.tgz
cd cyana-3.98/
./setup
cd ~
cd AUREMN2018
  cp -r demo_data flyabb  
  cd flyabb
  cd flyabb
<!---
../../cyana-3.98/cyana
--->
  cyana
  cyana
  ___________________________________________________________________
  ___________________________________________________________________
   
   
  CYANA 3.98 (linux64-intel)
  CYANA 3.98 (mac-intel)
   
   
  Copyright (c) 2002-17 Peter Guentert. All rights reserved.
  Copyright (c) 2002-17 Peter Guentert. All rights reserved.
  ___________________________________________________________________
  ___________________________________________________________________
   
   
     Demo license valid for specific sequences until 2017-12-31
     Demo license valid for specific sequences until 2018-12-31
   
   
     Library file "/home/guentert_l/cyana/cyana-3.98/lib/cyana.lib" read, 41 residue types.
     Library file "/Users/deans/cyana-3.98/lib/cyana.lib" read, 41 residue types.
    Sequence file "demo.seq" read, 114 residues.
*** ERROR: Illegal residue name "LIG".
*** ERROR: Cannot read line 114:
            LIG  333
  cyana> q
  cyana> q


If all worked, you are ready to go!
If all worked, you are ready to go as far as CYANA is concerned! The ERROR message appears because the directory contains a sequence file that includes the ligand, but not yet a library file for the ligand.
Don't worry, this is expected and you will take care of it during the exercise.
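
To see which residue names a sequence file uses, and therefore which of them (such as LIG) still need a library entry, the file can be summarized in the shell. The following is only a minimal sketch: the function name 'residue_types' is made up for illustration and is not part of CYANA, and it assumes that every non-comment line of the sequence file starts with the three-letter residue name, as in the 'LIG  333' line quoted in the error message.

```shell
# Hypothetical helper (not part of CYANA): list the distinct residue names
# that occur in a sequence file, one per line, in sorted order.
# Assumption: the residue name is the first field of every non-comment line.
residue_types() {
  awk '!/^[[:space:]]*#/ && NF > 0 { print $1 }' "$1" | sort -u
}
```

For example, 'residue_types demo.seq' prints one residue name per line; a name that is not among the residue types of 'cyana.lib' is a likely cause of the ERROR shown above.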


If you want to return to your practical later, using your own Linux or Mac OS X computer, you can download the demo version of CYANA from [[Media:cyana-3.98bin-170805Demo.tgz‎|here]].
If you want to return to your practical later, using your own Linux or Mac OS X computer, you can download the demo version of CYANA from [http://www.cyana.org/wiki/images/6/64/Cyana-3.98bin-180213Demo.tgz here].


'''Hint:''' More information on the CYANA commands etc. is in the [[CYANA 3.0 Reference Manual]].


== Experimental input data ==
== Automated resonance assignment ==
Resonance assignment within CYANA is done using the module FLYA.
 
In the most general sense, two types of experiments are used for protein resonance assignment:
through-bond, TOCSY-type experiments and through-space, NOESY-type experiments.
Each of these two types carries distinct information that helps the resonance assignment.
The HSQC, HMQC or TROSY elements of these experiments merely improve the resolution by separating resonances according to spin type (1H, 13C, 15N) into additional dimensions.
 
At the very minimum, for small systems and in favorable cases, a NOESY experiment may be sufficient to get an assignment and enough distance restraints for a structure calculation.
 
=== Experimental input data ===


The protein sequence is stored in three-letter code in the file 'demo.seq'.
Spectra are processed and referenced relative to each other. Peak lists in XEASY format are prepared by automatic peak picking with a visualization program such as CcpNmr Analysis, NMRDraw or NMRView and saved as ''XXX''.peaks, where ''XXX'' denotes the name of the peak list.
Then they are cleaned (unnecessary water and noise peaks are removed).


Experimental peak lists are available for the following spectra:
As part of the data supplied for the exercises, experimental peak lists are available for the following spectra:
* HNtrosy            (spectrum type  'N15HSQC' in the CYANA library)


Line 54: Line 97:
* HNCOCA          (spectrum type 'HNcoCA' in the CYANA library)
* HNCACB            (spectrum type 'CBCANH' in the CYANA library)
* HCCCHTOCSY  (the spectrum type will have to be determined)
* HCCCHTOCSY  (the spectrum type will have to be determined in the first exercise)
* NTOCSY            (spectrum type 'N15TOCSY' in the CYANA library)


* 3D [<sup>13</sup>C]-resolved NOESY  called aro       (spectrum type 'C13NOESY' in the CYANA library)
* 3D <sup>13</sup>C-resolved NOESY  called aro (spectrum type 'C13NOESY' in the CYANA library)
* 3D [<sup>13</sup>C]-resolved NOESY  called cnoesy (spectrum type 'C13NOESY' in the CYANA library)
* 3D <sup>13</sup>C-resolved NOESY  called cnoesy (spectrum type 'C13NOESY' in the CYANA library)
* 3D [<sup>15</sup>N]-resolved NOESY  called nnoesy (spectrum type 'N15NOESY' in the CYANA library)
* 3D <sup>15</sup>N-resolved NOESY  called nnoesy (spectrum type 'N15NOESY' in the CYANA library)
 
Peak lists in XEASY format that have been prepared by automatic peak picking with the program NMRView are stored in files ''XXX''.peaks, where ''XXX'' denotes the FLYA spectrum type.


Each peak list starts with a header that defines the experiment type and the order of dimensions. For instance, for HNCA.peaks:
Line 75: Line 116:
       7  6.475  54.017  98.159 1 U  2.547E+01  0.000E+00 e 0    0    0    0


The first line specifies the number of dimensions (3 in this case). The '#SPECTRUM' lines gives the experiment type (HNCA, which refers to the corresponding experiment definition in the CYANA library), followed by an identifier for each dimension of the peak list (HN C N) that specifies which chemical shift is stored in the corresponding dimension of the peak list. These labels must match those in the corresponding experiment definition in the general CYANA library (see below). After the '#SPECTRUM' line follows one line for every peak. For example, the first peak in the 'HNCA.peaks' list has
The first line specifies the number of dimensions (3 in this case). The '#SPECTRUM' line (written as one word, without spaces) gives the experiment type (HNCA, which refers to the corresponding experiment definition in the CYANA library), followed by an identifier for each dimension of the peak list (HN C N) that specifies which chemical shift is stored in the corresponding dimension of the peak list. The experiment type and identifiers must correspond to an experiment definition in the general CYANA library (see below). If the definition for an experiment is missing, it must be added to the CYANA library. After the '#SPECTRUM' line follows one line for every peak. For example, the first peak in the 'HNCA.peaks' list has


* Peak number 5
Line 86: Line 127:
'''Hint:''' The formats of other CYANA files are described in the [[CYANA 3.0 Reference Manual]].
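
Before editing a peak list it can help to look only at the parts of its header described above. The following is a minimal sketch, not a CYANA tool: the function name 'peak_header' is hypothetical, and it assumes that the '#SPECTRUM' line starts in the first column of the file.

```shell
# Hypothetical helper: print the first line of an XEASY peak list (which
# states the number of dimensions) and its '#SPECTRUM' line.
# Assumption: '#SPECTRUM' starts in the first column.
peak_header() {
  awk 'NR == 1 { print } /^#SPECTRUM/ { print; exit }' "$1"
}
```

Calling 'peak_header HNCA.peaks' prints the two header lines without scrolling through the peaks; if only one line appears, the '#SPECTRUM' line is missing or misspelled.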


<!--
== FLYA initialization script ==


The CYANA commands to run the automated assignment calculation are stored in two CYANA scripts or "macros".  
The protein sequence is supplied in three-letter code in a file ''XXX''.seq.


One has the fixed name 'init.cya' and is executed automatically each time CYANA is started. It can also be called any time one wants to reinitialize the program. It contains normally at least two commands that read the CYANA library and the protein sequence:
As part of the supplied data for the exercises there are two sequences:
* demoShort.seq              (the protein sequence alone)
* demoLong.seq              (the protein sequence, ligand and a linker that connects the two molecules)


cyanalib
A linker sequence serves to keep two or more molecules close in coordinate space during calculations. It is usually 15-20 elements long and is composed of dummy atoms that allow the linking.
read demo.seq
 
The command 'cyanalib' reads the standard CYANA library. The second command reads the protein sequence.
-->


== Experiment definitions in the CYANA library ==
==== SPECTRUM definitions in the CYANA library ====


When you start CYANA, the program reads the library and displays the full path name of the library file. You can open the standard library file to inspect, for example, the NMR experiment definitions that define which expected peaks are generated by FLYA. For instance, the definition for the HNCA spectrum (search for 'HNCA' in the library file 'cyana.lib') is
When you start CYANA, the program reads the library and '''displays the full path name of the library file'''. You can open the standard library file to inspect, for example, the NMR experiment definitions that define which expected peaks are generated by FLYA. For instance, the definition for the HNCA spectrum (search for 'HNCA' in the library file 'cyana.lib') is


  SPECTRUM HNCA  HN N C
Line 107: Line 144:
   0.800  HN:H_AMI  N:N_AMI  (C_ALI) C_BYL  C:C_ALI  


The first line corresponds to the '#SPECTRUM' line in the peak list. It specifies the experiment name and a label for the atoms that are detected in each dimension of the spectrum. The number of labels defines the dimensionality of the experiment (3 in case of HNCA).
The first line corresponds to the '#SPECTRUM' line in the peak list. It specifies the experiment name and identifies the atoms that are detected in each dimension of the spectrum. The number of identifiers defines the dimensionality of the experiment (3 in case of HNCA).


Each line below defines a (formal) magnetization transfer pathway that gives rise to an expected peak. In the case of HNCA there are two lines, corresponding to the intraresidual and sequential peak. For instance, the definition for the intraresidual peak starts with the probability to observe the peak (0.980), followed by a series of atom types, e.g. H_AMI for amide proton etc. An expected peak is generated for each molecular fragment in which these atom types occur connected by single covalent bonds. The atoms whose chemical shifts appear in the spectrum are identified by their labels followed by ':', e.g. for HNCA 'HN:', 'N:', and 'C:'.
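
As an alternative to paging through the library with 'less', the 'SPECTRUM' header lines can be listed directly. This is a sketch under stated assumptions: the function name 'list_spectra' is invented for illustration, and the library path has to be the one that CYANA prints at startup.

```shell
# Hypothetical helper: list the 'SPECTRUM' header lines of a CYANA library
# file whose experiment name contains a given text (case-insensitive),
# each prefixed with its line number in the library file.
#   $1 = library file (full path as printed by CYANA at startup)
#   $2 = text to look for in the experiment name, e.g. HNCA
list_spectra() {
  awk -v pat="$2" '
    $1 == "SPECTRUM" && index(toupper($2), toupper(pat)) > 0 { print NR ": " $0 }
  ' "$1"
}
```

The printed line numbers can then be used to jump to a definition in 'less' and to read the transfer pathways below the 'SPECTRUM' line.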


'''Exercise 1: Determine the appropriate spectrum type for HCCCHTOCSY'''
=== Exercise 1: Determine the spectrum type ===
 
For the HCCCHTOCSY peak list, determine the spectrum type and enter the definition in the 'HCCCHTocsy.peaks' file with the appropriate syntax.
 
The experiment is a TOCSY, a through-bond experiment. It allows you to see, in this case, from the backbone all the way out into the side chains.
 
* Use the 'less' command (which displays files in the terminal without changing them) to search for the spectrum type in the 'cyana.lib' file.
 
'''Hint:''' Look at the definitions themselves and not just the SPECTRUM names, to determine which TOCSY is the appropriate one.
Take the experiment with the most through-bond transfers.
* Work in the copy of the data directory ('cd flyabb').
* Use a graphic text editor (or if you feel comfortable, use the vi terminal text editor) to manipulate the HCCCHTOCSY.peaks file and enter the appropriate spectrum type and identifiers in the correct order.
* '''Check that the order of the dimensions in your SPECTRUM definition matches the actual experiment.'''
 
Depending on how the XEASY peak file was generated, the order of its dimensions need not match the order in which the experiment was recorded, and in general it will not match the SPECTRUM definition given in the 'cyana.lib' file either. If it does not match, do not change the 'cyana.lib' file; instead, switch the order of the atom labels in the definition you add to your '*.peaks' file.
 
 
'''Hint:''' The order of the dimensions and the atom types can be determined quickly by looking at the chemical shift columns and recognizing the chemical shift patterns.
 
This may be harder than it sounds at first; take your time to detect the pattern if it is not immediately obvious to you.
 
As you know, the chemical shifts of specific atom types and groups are quite distinct. If you need help with the chemical shift statistics, go to: [http://www.bmrb.wisc.edu/ref_info/stats.php?set=filt&restype=aa&output=html BMRB]
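
The chemical shift range of each column can also be summarized instead of being read off by eye. The sketch below is an illustration only: the function name 'shift_ranges' is hypothetical, and it assumes that every peak line has the peak number in the first field followed directly by one chemical shift per dimension, as in the HNCA example above.

```shell
# Hypothetical helper: print the minimum and maximum chemical shift found
# in each dimension of an XEASY peak list.
#   $1 = peak list file
#   $2 = number of dimensions (see the first line of the peak list)
# Assumption: field 1 is the peak number, fields 2..(ndim+1) are the shifts.
shift_ranges() {
  awk -v ndim="$2" '
    /^#/ || NF == 0 { next }              # skip header lines and blank lines
    {
      for (d = 1; d <= ndim; d++) {
        v = $(d + 1) + 0
        if (!(d in lo) || v < lo[d]) lo[d] = v
        if (!(d in hi) || v > hi[d]) hi[d] = v
      }
    }
    END {
      for (d = 1; d <= ndim; d++)
        printf "dim %d: %.3f - %.3f ppm\n", d, lo[d], hi[d]
    }' "$1"
}
```

Comparing the printed ranges with the BMRB statistics indicates which column holds which type of nucleus, and hence the order in which the identifiers have to appear in the '#SPECTRUM' line.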
 
'''Hint:''' For information on how to use the vi terminal editor: [https://www.cs.colostate.edu/helpdocs/vi.html vi editor]
 
=== Exercise 2: Run FLYA ===


use the less command to search the spectrum type in the cyana.lib file
* Work in the copy of the data directory ('cd flyabb').


copy the spectrum type definition
Using the text editor of your choice, create your 'init.cya' macro as outlined in '''The init macro''' and your 'CALC.cya' macro as outlined in '''The FLYA CALC macro''' to run FLYA. Be extra careful to avoid typos and unwanted spaces in comma-separated lists etc.


use the vi editor to open the HCCCHTOCSY.peaks file and paste the appropriate spectrum type
==== Execution scripts or "macros" in CYANA ====


'''Hint:''' For information on how to use less and the vi editor visit:
For more complex tasks within CYANA, the necessary commands are collected in a file named '*.cya' rather than entered line by line at the CYANA prompt. Collecting the commands in macros has the added advantage that the macros serve as a record from which previous calculations can be reconstructed.


less cyana.lib
==== The init macro ====
vi HCCCHTOCSY.peaks


== FLYA execution scripts ==
The initialization macro file has the fixed name 'init.cya' and is executed automatically each time CYANA is started. It can also be called at any time to reinitialize the program by typing 'init'. It normally contains at least two commands that read the CYANA library and the protein sequence:


Rather than enter the execution commands line by line at the cyana prompt, the necessary commands are collected in a file called xxx.cya.
rmsdrange:=15-111
cyanalib
read demoShort.seq


Therefore, CYANA scripts ("macros") 'CALC*.cya' contain the various commands to perform the required tasks.
The first line sets the appropriate rmsdrange, and the command 'cyanalib' reads the standard CYANA library. The next command reads the protein sequence.


For instance, 'CALCbackbone.cya' performs automated backbone resonance assignment. It starts with the specification of the names of the input peak lists:
The protein sequence is stored in three-letter code in the file 'demo.seq'.


peaks:=HNtrosy,trHNCA,HNCOCA,HNCACB,NTocsy,HCCCHTocsy,aro,cnoesy,nnoesy
==== The FLYA CALC macro ====


The peak list names are separated by commas (without blanks!). The files on disk have the file name extension .peaks, e.g. HNCA.peaks.
The 'CALC.cya' macro starts with the specification of the names of the input peak lists:


The commands above will use all available peak lists. You can choose any subset of them by modifying the 'peaks:=...' statement.  
peaks:=aro,cnoesy,nnoesy,HNtrosy,trHNCA,HNCOCA,HNCACB,NTocsy,HCCCHTocsy
 
The peak list names are separated by commas (without blanks!). The files on disk have the file name extension .peaks, e.g. 'HNCA.peaks'.
 
The command above will use all available peak lists. You can choose any subset of them by modifying the 'peaks:=...' statement.  
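
Because a misspelled name in the 'peaks:=...' list may only show up once the calculation has been started, it can be worth checking first that every named peak list exists on disk. The following is a minimal sketch: the function name 'missing_peak_lists' is made up for illustration, and it relies only on the convention stated above that each name corresponds to a file with the extension .peaks.

```shell
# Hypothetical helper: print the name of every peak list file that is
# named in a comma-separated list but does not exist on disk.
#   $1 = comma-separated list, exactly as written after 'peaks:='
#   $2 = directory to look in (default: current directory)
# The body runs in a subshell so that the changed IFS does not leak out.
missing_peak_lists() (
  dir=${2:-.}
  set -f                      # no file name expansion while splitting
  IFS=','
  for name in $1; do
    [ -f "$dir/$name.peaks" ] || printf '%s\n' "$name.peaks"
  done
)
```

For example, 'missing_peak_lists aro,cnoesy,nnoesy' prints nothing if all three files are present in the current directory. The check is case-sensitive on Linux, so it also catches differences in capitalization between the list and the file names.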


These are followed by tolerances for chemical shift matching:
Line 146: Line 213:


In this case, a tolerance of 0.03 ppm will be used for protons, and 0.4 ppm for carbon and nitrogen.
The 'assigncs_accX' variables are used within FLYA; the 'tolerance' variable is used in the consolidation.


The next parameter specifies the seed value for the random number generator (an arbitrary positive integer is ok).  
Line 151: Line 219:
  randomseed=101


FOR THE SAKE OF INFORMATION, WE DO NOT DO THIS
The next parameter chooses the "quick" optimization schedule in order to speed up the calculation for this practical:
Groups of atoms for which assignment statistics will be calculated and reported in the 'flya.txt' output file can be defined like this:
 
<!---
shiftassign_population=25 (we do not set this, so the default of XX is used)
-->
shiftassign_quick=.true.
 
In production runs, better results can be expected (at the expense of longer computation times) if this parameter is not set.
<!---
* The population size for the genetic algorithm, i.e. how many assignments form one generation (25; chosen smaller than in normal production runs in order to speed up the calculation) 
-->
The next command specifies that swapping of diastereotopic pairs is optimized with respect to the supplied reference assignment (only applied during consolidation):
 
shifts_consolidate_swap=.true.
 
Finally, there is the command to start the FLYA algorithm:


  analyzeassign_group := BB: N H CA CB C
  flya runs=10 assignpeaks=$peaks shiftreference=manREF.prot
 
Here, the given parameters of the 'flya' command specify the following:
 
* The number of independent runs of the algorithm, from which the consolidated shift will be calculated (chosen smaller than in normal production runs in order to speed up the calculation).
* The input peak lists that will be used (as defined above).
* An ensemble of random structures will be calculated for generating expected peaks (this leads to the prediction of short-range NOEs in NOESY-type experiments).
* The results will be compared with the reference chemical shifts in the file 'manREF.prot' (which have been determined by conventional methods).


The next commands restrict the generation of expected peaks to a subset of atoms, here the backbone atoms:
command select_atoms
  atom select "N H CA CB C"
end


In this case, the command defines a group called BB (a name that can be chosen freely) comprising the atoms N, H, CA, CB, C.
Once you have prepared 'init.cya' and 'CALC.cya', start your FLYA macro ('CALC.cya') on 10 processors by calling it as outlined just below. Once the calculation starts, the assignment takes 10-20 minutes to complete, depending on your system.


Specific labeling can be handled in the same way, and peak list-specific atom selections can be applied as follows (not used in 'CALCbackbone.cya' but in 'CALClabeling.cya'):
To run the FLYA calculation, one could start CYANA and execute the 'CALC.cya' macro from the CYANA prompt. However, on a computer with multiple processors it is better to speed up the calculation by running the 'CALC.cya' macro in parallel:


  command ''XXX''_select
  cyana -n 10 CALC.cya
  atoms select "..."
end


Two parameters of the assignment algorithm are set in order to speed up the calculation for this practical:
This starts 10 independent calculations on 10 processors by using the MPI scheduler (if installed on your system, otherwise shared memory will be used).


shiftassign_population=25
It is strongly recommended that you check whether your calculation is actually running. If you made a mistake in one of the two macros, the calculation may not start at all, or it may be interrupted at some point.
shiftassign_quick=1


In production runs, better results can be expected (at the expense of longer computation times) if these parameters are not set. These parameters specify:
ls -ltr


* The population size for the genetic algorithm, i.e. how many assignments form one generation (25; chosen smaller than in normal production runs in order to speed up the calculation
This command lists the files sorted by modification time (newest last), so that you can check whether the calculation has generated intermediate files or final results.
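
The same check can be narrowed to the result files that this tutorial describes. The sketch below is illustrative only: the function name 'flya_outputs' is hypothetical, the file names are those of the FLYA output files discussed in this tutorial, and the presence of a file says nothing about whether the calculation has already finished.

```shell
# Hypothetical helper: report which of the FLYA result files described in
# this tutorial are present (and not empty) in a directory.
#   $1 = directory to check (default: current directory)
flya_outputs() (
  dir=${1:-.}
  for f in flya.prot flya.tab flya.txt flya.pdf; do
    if [ -s "$dir/$f" ]; then
      echo "$f: present"
    else
      echo "$f: missing or empty"
    fi
  done
)
```

Running 'flya_outputs' in the 'flyabb' directory after the job has ended gives a quick overview before the files are opened for Exercise 3.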
* An option to choose the "quick" optimization schedule.


Finally, there is the command to start the FLYA algorithm:
<!---
To check the queuing on the server use (PEND or if running you see the time elapsed and JOBID):


  flya runs=10 assignpeaks=$peaks structure= shiftreference=ref.prot
  squeue
            JOBID PARTITION    NAME    USER ST      TIME  NODES NODELIST(REASON)
              4414      bnmr    CALC    dxxxx  R      0:56      3 guri[1,5-6]


Here, the given parameters of the 'flya' command specify that
And to cancel the processes started by you before completion:


* The number of independent runs of the algorithm, from which the consolidated shift will be calculated (chosen smaller than in normal production runs in order to speed up the calculation).
scancel 4414
* The input peak lists that will be used (as defined above).
* No ensemble of random structures will be calculated for generating expected peaks (is only necessary for NOESY-type experiments).
* The results will be compared with the reference chemical shifts in the file 'ref.prot' (which have been determined independently by conventional methods). The reference chemical shifts will not be used by the algorithm but only for a subsequent analysis of its results.


== Run the FLYA calculation ==
--->
To check the general load on your local computer use:


To run the FLYA calculation, you start CYANA and execute the corresponding 'CALC*.cya' script. For instance:
top


cyana "nproc=5; CALCbackbone"
To kill all of your running processes:


By specifying 'nproc=5', 5 independent runs of the algorithm will be performed in parallel. On a computer with multiple processors this will speed up the calculation, which is expected to take a few minutes.
skill -u <username>


== FLYA output files ==
=== FLYA output files ===


The FLYA algorithm will produce the following output files:
Line 210: Line 292:
* '''''XXX''_asn.peaks:''' Assigned peak list, corresponding to input peak list ''XXX''.peaks


=== The flya.txt file ===
==== The flya.txt file ====


This output file starts with overall assignment statistics for each group of atoms as defined by 'analyzeassign_group:=...' in CALCbackbone.cya':
This output file starts with overall assignment statistics for each group of atoms as defined by 'analyzeassign_group:=...':


     ____________________________________________________________
     ____________________________________________________________
Line 265: Line 347:
There is more information on the results of the assignment calculation in the 'flya.txt' file (not described here).


=== The flya.tab file ===
==== The flya.tab file ====


This file provides information about the chemical shift assignment of each individual atom:
Line 300: Line 382:
** '''('''''atom name'''''):''' Correct assignment, if within the same residue (no residue number given), or the neighboring residues.


=== The flya.pdf file ===
==== The flya.pdf file ====


This PDF file provides a graphical representation of the 'flya.tab' file. Each assignment for an atom is represented by a colored rectangle.  
[[Image:flyabackbone.png|thumb|600px|'''flya.pdf generated by CALCbackbone.cya''']]
[[Image:flyabackbone.png|thumb|600px|'''flya.pdf''']]


* '''Green:''' Assignment by FLYA agrees with the manually determined reference assignment (within tolerance)  
Line 312: Line 394:
Respective light colors indicate assignments not classified as strong by the chemical shift consolidation. The row labeled HN/Hα shows for each residue HN on the left and Hα in the center. The N/Cα/C’ row shows for each residue the N, Cα, and C’ assignments from left to right. The rows β-η show the side-chain assignments for the heavy atoms in the center and hydrogen atoms to the left and right. In the case of branched side-chains, the corresponding row is split into an upper part for one branch and a lower part for the other branch.


== FLYA applications ==
=== Exercise 3: Analyze the FLYA results ===
 
*Analyze your FLYA results using 'less' or a graphical text editor and a pdf viewer.
 
*What do you think?
 
*How does the automated assignment compare to the provided assignment?
 
*How robust do you think the results are?
 
*What could you do to improve the result?
 
'''Hint:''' Use the terminal command 'gs' to view pdf files (control-C to quit gracefully):
gs flya.pdf
 
== Using Talos to generate torsion angle restraints ==
 
Torsion angle restraints from the backbone chemical shifts help restrict angular conformation space. We wish to use only "strong assignments" to generate these restraints.
 
If you do not have TALOS installed, get it from [https://www.ibbr.umd.edu/nmrpipe/install.html here]. It is part of the NMRPipe software package.
 
=== Exercise 4: Calculate backbone torsion angle restraints using Talos ===
 
'''Hint: ''' Copy the FLYA results into a new folder, since otherwise you will overwrite your original 'flya.prot' file.
 
Essentially you will need to copy the details directory and the 'flya.prot' file.


CYANA macros 'CALC*.cya' are provided for the following FLYA tasks:
cp -r flyabb acoPREP
cd acoPREP
rm *.peaks *.out *.job


=== CALC.cya: standard automated chemical shift assignment ===
Use a text editor of your choice to create a 'CALC.cya' file with the commands to calculate the TALOS torsion angle restraints.


* specify list of input peak lists in variable 'peaks' without intervening blanks
TALOS is used to generate torsion angle restraints from the backbone chemical shifts in 'flya.prot'.
* specify tolerances for 1H, 13C, 15N with variables assigncs_assH, assigncs_assC assigncs_assN
* command 'select_atoms' excludes some nuclei that are difficult to detect
* optional parameter 'shiftreference=ref.prot' specifies reference chemical shift list, used only for comparison in flya.tab, flya.txt, flya.pdf


'''Note that the input data for this calculation contains two mistakes. Try to identify the problem by inspecting the 'flya.txt' file and the input files. Correct the mistakes and rerun the calculation before proceeding with other calculations!
consolidate reference=flya.prot file=flya.tab plot=flya.pdf prot=details/a[0-9][0-9][0-9].prot
'''


=== CALCbackbone.cya: standard backbone chemical shift assignment ===
This overwrites the original flya.prot with only strong assignments.
* parameter 'structure=' to avoid generation of random structures, which are not needed if using only through-bond spectra


=== CALCexperiments.cya (optional): using modified/new experiment definitions in library ===
read prot flya-strong.prot unknown=skip
* modified HCCHTOCSY only for aromatics (library HCCHTOCSY.lib, peak list HCCHTOCSYaro.peaks)
* new experiment N15NOESY2D (library peak list N15NOESY2D.lib, peak list N15NOESY2D.peaks)
talos talos=talos+               
talosaco pred.tab
write aco talos.aco
 
This will call the program TALOS+ and store the resulting torsion angle restraints in the file 'talos.aco'.
 
Since this is not a calculation suited for the MPI scheduler, start CYANA first, then call the 'CALC.cya' macro from the prompt.
 
<!---
'''Hint: ''' On stift (or kreide): change to a cshell and prepare nmrpipe by typing:
csh
prepare nmrpipe
--->
'''Hint: ''' Change to a C shell (csh) before running CYANA, since TALOS needs a C shell to run:
csh
 
== Automated NOESY assignment and structure calculation ==
 
We will perform an automated NOE restraint assignment and structure calculation by torsion angle dynamics.
 
The 'flya.prot' file from the automated resonance assignment will be used together with the (unassigned) NOESY peak lists to assign the NOESY peaks and to generate distance restraints. The structure is calculated in cycles, essentially testing the NOE assignment and iteratively refining it, in order to compute the three-dimensional structure of the protein.
 
=== Exercise 5: Run noeassign ===
 
Copy the 'flyabb' directory and give it the name 'noebb', then delete all the files and data we do not need, to reduce clutter and keep a better overview.
 
cp -r flyabb noebb
cd noebb
rm *asn.peaks *exp.peaks *.out *.job
rm -rf details
 
From the directory 'acoPREP' copy the calculated talos restraints ('talos.aco').
Inside the 'noebb' directory, use a text editor to edit the 'CALC.cya' file for noeassign as outlined.
 
==== The noeassign CALC macro ====
 
peaks:= cnoesy.peaks,nnoesy.peaks,aro.peaks
prot:= flya.prot                 
restraints:= talos.aco                   
tolerance:= 0.040,0.030,0.45           
structures := 100,20                     
steps:= 10000                     
randomseed:= 434726   
                 
noeassign peaks=$peaks prot=$prot autoaco
 
To speed up the calculation, you can optionally set in 'CALC.cya':
 
structures:=50,10
steps=5000
 
These commands tell the program to calculate, in each cycle, 50 conformers, and to analyze the best 10 of them. 5000 torsion angle dynamics steps will be applied per conformer.
If you do not set these options, 100 conformers will be calculated, and the 20 best will be analyzed and kept.
 
<!---Where as ????x torsion angle dynamics steps will be applied per conformer. --->
 
When you are done preparing the macros as outlined, run the calculation.
 
The automated NOE assignment and structure calculation will be performed by running the 'CALC.cya' macro:
 
cyana -n 33 CALC.cya
 
This means that each processor will calculate about 100/33 ≈ 3 conformers. If you changed the setup to calculate 50 structures, you would start the calculation with 'cyana -n 25 CALC.cya'. 7 cycles of calculations will be performed.
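
The division does not come out even: 100 conformers on 33 processes leaves one conformer over. As a rough sketch, assuming the conformers are spread as evenly as possible over the processes, the load of the busiest process can be estimated in the shell. The function name 'conformers_per_process' is made up for this illustration and is not part of CYANA:

```shell
# Hypothetical helper, not a CYANA command: ceiling of conformers/processes,
# i.e. the number of conformers on the busiest process.
conformers_per_process() {
  # $1 = number of conformers, $2 = number of processes
  echo $(( ($1 + $2 - 1) / $2 ))
}

conformers_per_process 100 33
conformers_per_process 50 25
```

With 100 conformers and 33 processes this gives 4 (one process has to calculate a fourth conformer); with 50 conformers and 25 processes it gives 2.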
 
Statistics on the NOE assignment and the structure calculation will be in the file 'Table', which can also be produced with the command 'cyanatable -lp'.
 
The final structure will be 'final.pdb'. You can visualize it, for example, with the command
 
chimera final.pdb
 
The optimal residue range for superposition can be found with the command
 
cyana overlay final.pdb
 
Run noeassign with your 'CALC.cya' macro.
 
You can check the statistics (and success of 'noeassign') by running:
 
cyanatable
 
== Creating the ligand library file for CYANA ==
 
In the next three exercises you will create the ligand library file for CYANA from scratch. Do this carefully and check your result, otherwise your structure calculation will not work as intended.
 
=== Exercise 6: Drawing the molecule and obtaining the SMILES code ===
 
* make a copy of the libex directory and work in there (libexbb)
cp -r libex libexbb
cd libexbb
 
Go to the [http://zinc.docking.org/search/structure ZINC] website.
 
Click on the Structure tab and draw the molecule using the supplied drawing (LIG.png) of the compound as a guide.
Copy the SMILES code.
 
 
'''Hint:''' To look at the supplied image file in the terminal, use:
 
xdg-open LIG.png
 
=== Exercise 7: Converting the SMILES code to mol2 ===
 
*work in the copy of the libex directory ('cd libexbb')
 
There are many options and programs to do this; we outline two:
 
 
If you can use Avogadro (best):
 
For Mac OS download Avogadro from: [https://avogadro.cc/ Avogadro]
 
Build --> Insert --> SMILES
 
Paste the SMILES code
 
Extensions --> Optimize Geometry
 
Save as
 
--> LIG.mol2 (*.mol2)
 
 
 
If you have to use chimera:
 
(If you are on the Linux server, chimera is installed.)
 
Tools --> Structure Editing  --> Build Structure
Start Structure
 
--> SMILES string 
 
set the Residue name to LIG (capital letters)
 
--> Apply
Save your mol2 file as: LIG.mol2
 
'''Now, there is one issue we have to take care of: The intermolecular NOE assignments have to match the ligand structure assignment, otherwise the intermolecular NOEs will be wrong.'''
 
Using the text editor of your choice, manually change the "UNL1" in your mol2 file to "LIG".
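
If you prefer the terminal to a text editor, the replacement can be sketched with sed. This assumes that the string UNL1 occurs in the file only as the residue name; the function name 'rename_residue' and the output file name 'LIG_renamed.mol2' are made up for this illustration. The function only prints the converted text, so the original file stays untouched until you have inspected the result:

```shell
# Hypothetical helper: print a mol2 file with the residue name replaced.
rename_residue() {
  # $1 = old residue name, $2 = new residue name, $3 = mol2 file
  sed "s/$1/$2/g" "$3"
}

# Usage (assumes LIG.mol2 exists in the current directory):
# rename_residue UNL1 LIG LIG.mol2 > LIG_renamed.mol2
```

Compare the two files (for example with 'diff LIG.mol2 LIG_renamed.mol2') before you continue with the renamed copy.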
 
Then open the supplied demoLIG.pdb structure in chimera, as well as the created mol2 structure.
 
chimera demoLIG.pdb LIG.mol2
 
First check the geometry (especially the rings and the stereochemistry). If it is wrong, fix it!
 
Using the text editor of your choice or, more conveniently, chimera, change the proton names in your mol2 file to match those of the pdb.
 
'''Hint:'''  Overlay the two ligand structures in chimera.
 
Favorites --> Command Line
 
In the command line enter:
 
match #1 #0
 
Depending on how the models are loaded you may need to change the #? numbers.
To see the model number use the Favorites --> Model Panel.
 
If chimera complains about "Unequal numbers of atoms chosen for evaluation", delete the pseudo atoms of 'demoLIG.pdb' temporarily for the overlay. In the command line enter:
sel: @Q @Q? @Q??
delete sel
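
Alternatively, a temporary copy of the pdb file without pseudo atoms can be written in the terminal, leaving 'demoLIG.pdb' itself unchanged. The sketch below assumes the standard PDB column layout (atom name in columns 13-16) and that all pseudo atom names start with Q, as in the chimera selection above; 'strip_pseudo_atoms' and 'demoLIG_noQ.pdb' are names chosen for this illustration:

```shell
# Hypothetical helper: print a pdb file without pseudo atoms (names starting with Q).
strip_pseudo_atoms() {
  # $1 = pdb file; only ATOM/HETATM records are filtered, all other lines are kept
  awk '!(substr($0, 1, 6) ~ /^(ATOM  |HETATM)/ && substr($0, 13, 4) ~ /^ *Q/)' "$1"
}

# Usage (assumes demoLIG.pdb exists in the current directory):
# strip_pseudo_atoms demoLIG.pdb > demoLIG_noQ.pdb
```

The filtered copy can then be loaded in chimera instead of the original for the overlay.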
 
To rename selected atoms (control click) in the command line:
setattr a name HX sel
 
Hovering over atoms will display their names!
 
=== Exercise 8: Converting the mol2 file to a lib file for CYANA ===
 
*work in the copy of the libex directory ('cd libexbb')
*unpack the tool to convert the mol2 to a *.lib file
 
tar zxf cylib-2.0.tgz
 
Run cylib with the options -nc -sc:
 
./cylib-2.0/cylib -nc -sc LIG.mol2
 
This will create the LIG.lib file.
 
The -sc option keeps the angles of the rings fixed. We can do this since they are in this molecule either aromatic or have sp3 conjugated carbons in them, fixing the ring geometry.
If they had to be flexible, you would need to keep the angles flexible and supply additional restraints to close the rings.
 
To test the lib file we need CYANA:
 
Create a sequence file containing 'LIG 333' and name the file 'LIG.seq'.
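
From the terminal, the one-line sequence file can be written as in this minimal sketch (residue name LIG and residue number 333, as stated above):

```shell
# Write the sequence file with a single entry: residue name and residue number.
printf 'LIG 333\n' > LIG.seq
cat LIG.seq
```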
 
Start CYANA
This will read the CYANA library file correctly but give you the error:
 
*** ERROR: Illegal residue name "LIG".
*** ERROR: Cannot read line 1:
                      LIG    333
 
Because we do not have an init file and have not read the 'LIG.lib' file yet, the program just tries to read the default sequence file in the directory, but the ligand is not yet in the library, so it fails...
 
read lib LIG.lib append
read seq LIG.seq
anneal
atoms select "* - &DUMMY"
pseudo=1
write pdb test.pdb selected
 
The command 'pseudo=1' ensures that the pseudo atoms will be in the written pdb file; 'atoms select "* - &DUMMY"' followed by 'write *.pdb selected' prevents the dummy atoms of the linker from being written to the pdb file.
 
 
'''Hint:'''  Since you might have to do this a few times until the library is working and correct, it might be worthwhile to create an 'init.cya' and a 'CALC.cya' macro with the respective commands. This speeds things up and prevents the error output shown above.
 
 
Carefully analyze the WARNING and ERROR messages if any.
 
Then take a look at your 'test.pdb' in chimera and check that the chemistry and bonds are all as expected (ring closure!)
 
chimera test.pdb
 
Again overlay 'test.pdb' with the provided 'demoLIG.pdb'.
 
If there are any issues, "go back to the drawing board" to fix them.
Also check the names of the pseudo atoms carefully, since they are used in the intermolecular NOEs later.
 
To help find problems, you may use the command:
 
write lib LIG.lib names
 
This will write the library file containing actual atom names rather than numbers.
 
=== Alternative Exercise 6-8: Converting a pdb file to a lib file for CYANA ===
 
In case you were unsuccessful with exercises 6-8 in terms of getting a working ligand library file, do not despair!
There is an easy workaround that you may be able to use in the real case as well, converting a pdb file to a library file for CYANA.
 
Use Avogadro:
 
File --> Open
 
Open the 5c5aLig.pdb
 
Save as
 
LIG.mol2 (*.mol2)
 
Rename the Residue to LIG in the LIG.mol2 file.
 
  ./cylib-2.0/cylib -nc -sc LIG.mol2
 
Done!
 
You can run the tests outlined above, using '''anneal''' etc to test your library file.
 
== Calculating the structure of the protein-ligand complex ==
 
=== Exercise 9: (Semi-automatic) Intermolecular cross peaks assignment and structure calculation ===
 
Since the molecular system contains protein and ligand, CYANA has to read the 'LIG.lib' file in addition to the regular 'cyana.lib' file.
The sequence file needs to contain the protein and the ligand (and a linker to connect the two).
 
Copy the noebb directory and give it the name noecc, then delete all the previous, unnecessary output files to reduce clutter and keep a better overview.
 
cp -r noebb noecc
cd noecc
rm *cycle* *.out *.job final* rama*
 
Update the 'init.cya' file in order to read the ligand library file and the sequence file containing the linker and the ligand.
 
Add the 'read lib LIG.lib append' command after the 'cyanalib' command but before reading the sequence. 'append' is necessary, otherwise the 'cyana.lib' file will be overwritten by the 'LIG.lib' file.
 
rmsdrange:=15-111,333
cyanalib
read lib LIG.lib append
read seq demoLong.seq
 
Intermolecular cross peaks are assigned by supplying noeassign with an intermolecular xeasy peak list in which only the ligand resonances are assigned.
The ligand resonances were assigned manually, based on an additional set of experiments (the semi-automatic part).
The resonance assignment thus matches the ligand atom assignment in the library file created in the previous exercise.
 
The protein side will then be assigned by noeassign.
 
Update your previous 'CALC.cya' macro by adding 'intermol-NOEs.peaks' to the peaks list and adding the 'keep=all' option to the noeassign command:
 
peaks:= cnoesy.peaks,nnoesy.peaks,aro.peaks,intermol-NOEs.peaks
prot:= flya.prot                 
restraints:= talos.aco                   
tolerance:= 0.040,0.030,0.45           
structures := 100,20                     
steps:= 10000                     
randomseed:= 434726 
write_peaks_names=.true.
assign_noartifact:="** list=intermol-NOEs.peaks"
noeassign peaks=$peaks prot=$prot keep=all selectcombine="* - @LIG" autoaco
 
The command 'assign_noartifact' effectively disables network anchoring tests for the ligand. Since the list supplied is cleaned and presumed artifact free, we are allowed to do this.
We thereby encourage the use of the intermolecular NOEs even if the support by other nearby NOEs is weak.
The command 'write_peaks_names=.true.' ensures that the assigned peak lists are written to file with the actual resonance names (this is not xeasy standard).
 
You can run the calculation again, commenting out (#) the 'assign_noartifact' command, and see the effect on the final structure.
 
'selectcombine' causes the testing for errors to be done differently:
intermolecular peaks do not have to compete with intraprotein peaks.
 
Run the calculation:
cyana -n 33 CALC.cya
 
== Comparing the calculated NMR structure to an XRAY reference structure ==
 
 
=== Exercise 10: Compare the NMR structure to the Xray structure ===
 
Download (www.rcsb.org) the xray structure with ID: 5c5a
 
Use either a web-browser or the terminal:
 
wget <nowiki>'https://files.rcsb.org/download/5c5a.pdb'</nowiki>
 
Using chimera it is possible to compare two structures, by overlaying and inspecting visually.
 
When you have your xray structure ready, load your calculated nmr structure and the xray structure in chimera.
 
Use the chimera-specific commands to overlay the two structures and compare them visually.
 
=== Exercise 11: Preparing an xray structure to use within CYANA ===
 
Deposited structures often lack specific features; e.g., X-ray structures usually lack proton coordinates.
 
Copy your noecc results to a new directory called regulabb, then delete all the previous, unnecessary output files to reduce clutter and keep a better overview.
 
cp -r noecc regulabb
cd regulabb
rm *cycle* *.out *.job
 
After reading the sequence file, the pdb file can be read with the option unknown=warn or unknown=skip; this will skip the parts of the molecule not specified in the sequence file.
 
read pdb xxxx.pdb unknown=warn
 
Other options to read pdb's:
 
read 5c5a.pdb unknown=warn hetatm new
 
where the option 'hetatm' allows reading of coordinates labeled HETATM rather than ATOM in the pdb, and 'new' will read the sequence from the pdb.
 
To write back out pdb's and sequences:
write pdb XXX.pdb
write seq XXX.seq 
 
Inspect the pdb using chimera:
Now, there are several issues besides HETATM, that make the comparison to the calculated NMR structure not possible within CYANA before you fix them.
You may use a graphical text editor to fix them. In the end, you need to have a conformer of the complex ready to compare with the calculated NMR structure.
 
Best would be to practice the use of the 'regularize' command as well. This is however not really necessary in this particular case, since this xray structure contains proton coordinates.
Using the regularize command one can get a structure calculated within CYANA that has these features but still is very close to the input structure of your choice.
 
Copy your 'LIG.lib' file and name it 'NUT.lib', in the 'NUT.lib' file change the residue name from LIG to NUT.
The 'NUT.lib' file is necessary to read the original xray structure with ligand into CYANA.
 
Copy the 'demoLong.seq' file and name it 'demoLongEd.seq', in the 'demoLongEd.seq' file delete the linker residues.
 
Create an 'init.cya' macro with:
cyanalib
read lib LIG.lib append
 
Then create a 'CALC_reg.cya' macro with:
read lib NUT.lib append
read 5c5a.pdb unknown=warn hetatm new
write 5c5a_Ed.seq
write 5c5a_Ed.pdb
#renumber and rename the ligand from 201 333, NUT to LIG
library rename "@NUT" residue=LIG
atoms select @LIG
atoms set residue=333
write 5c5a_renum.seq
write 5c5a_renum.pdb
#sequence with ligand but without linker
read demoLongEd.seq
read 5c5a_renum.pdb rigid unknown=warn
write XrayAChainRenum.pdb
initialize
read seq demoLong.seq
read pdb XrayAChainRenum.pdb unknown=warn
write pdb test.pdb
read pdb test.pdb
regularize steps=20000 link=LL keep
 
Execute the 'CALC_reg.cya' macro in the CYANA shell (or use only one processor, do not distribute the job):
 
cyana CALC_reg.cya
 
=== Exercise 12: Calculate the RMSD of NMR vs. xray structure using a CYANA macro ===
 
Using the INCLAN language of CYANA ([[Writing and using INCLAN macros]],[[Using INCLAN variables]],[[Using INCLAN control statements]]) it is possible to write complex macros that interact with the FORTRAN code of CYANA, reading internal variables and manipulating them to achieve custom tasks.
 
* save the manually edited xray structure (exercise 11) or the regularized xray structure (containing the ligand and called 'regula.pdb') as 'reg_xray.pdb' to use the macro below (or change the name in the macro accordingly).
* what do you think about the RMSD, does the value make sense? Does the range make sense?
 
Below you find the commands for a macro (call it 'CALC_RMSD.cya') that will read the regularized xray structure and the calculated nmr structure, then calculate the rmsd of both the protein and ligand parts of the complex:
 
read demoLong.seq
rmsd range=15-111 structure=final.pdb reference=reg_xray.pdb
atom select "BACKBONE 15-111"
t=rmsdmean
j=rindex('333')
n=0
s=0.0
do i ifira(j) ifira(j+1)-1
  if (element(i).gt.1) then
    n=n+1
    s=s+displacement(i)
  end if
end do
print "RMSD of the LIG: ${s/n} ($n atoms)"
read pdb final.pdb
structure mean
write pdb mean.pdb
read pdb mean.pdb
read pdb reg_xray.pdb append
atom select "BACKBONE 15-111"
t=rmsdmean
atom select "WITHCOORDALL"
j=rindex('333')
n=0
s=0.0
do i ifira(j) ifira(j+1)-1
  if (element(i).gt.1.and.asel(i)) then
    n=n+1
    s=s+displacement(i)*2
  end if
end do
print "Displacement of the LIG (to ref xray): ${s/n} ($n atoms)"
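
Note that the first loop of the macro divides a plain sum of displacements by the number of atoms, i.e. it reports a mean displacement. For comparison, the following sketch computes both the mean and the root-mean-square value from a column of per-atom displacements outside of CYANA. It assumes one displacement (in Å) per input line; the function name 'mean_and_rms' and the example numbers are made up for this illustration:

```shell
# Hypothetical helper: read one displacement per line from standard input
# and print the mean and the root-mean-square value.
mean_and_rms() {
  awk '{ sum += $1; sumsq += $1 * $1; n++ }
       END { if (n > 0) printf "%.3f %.3f\n", sum / n, sqrt(sumsq / n) }'
}

printf '3\n4\n' | mean_and_rms
```

For the two example displacements 3 and 4 the mean is 3.500 and the root-mean-square value is 3.536; the two measures differ more the more unevenly the displacements are distributed over the atoms.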
 
== Beyond The Basics: Improving the final structure ==


=== CALCexpfromlist.cya (optional): read expected peaks from a peak list ===
* command N15NOESY_expect, reading input peak list N15NOESY_in.peaks


=== CALCfixedpeaks.cya (optional): keep input peak assignments in user peak assignments ===
=== FLYA options ===
* (partially) assigned input peak list N15HSQCassigned.peaks
* parameter 'keepassigned' for loadspectra.cya


=== CALCfixedshifts.cya: fix input chemical shift assignments ===
There are a variety of commands to modify FLYA runs to accommodate experimental labeling schemes or apply previous assignments etc...
* input chemical shift list 'fix.prot'
* shift error in chemical shift list specifies range for assignment


=== CALClabeling.cya: use of experiment-specific isotope labeling ===
'''Modify the chemical shift statistics used for assignment'''
* command 'select_atoms' for general selection of assignable nuclei CcoNH + HSQCLEULYS
* command '<peak list name>_select' with atom selection for a specific peak list (e.g. C13HSQC_LK.peaks)
* command '<peak list name>_expect' for non-standard generation of expected peaks for a given peak list (e.g. CcoNH_LK.peaks with dimension-specific atom selection)


=== CALCnoesyonly.cya: chemical shift assignment using exclusively NOESY ===
Supply user-defined chemical shift statistics instead of standard BMRB statistics from library and
*increased population size with 'shiftassign_population=200'
replace the general statistics from 'cyana.lib' (CSTABLE).
* see Schmidt et al. J. Biomol. NMR 57, 193-204 (2013)


<!--=== CALCquick.cya: fast automated chemical shift assignment ===
* fixed number of generations in evolutionary optimization
-->
=== CALCstatistics.cya: user-defined chemical shift statistics instead of standard BMRB statistics from library ===
* average value and stddev from input chemical shift list 'shiftx.prot'
* 'assigncs_sd:=bmrb' to use stddev from BMRB (cyana.lib) instead of input chemical shift list
* 'assigncs_sd:=bmrb' to use stddev from BMRB ('cyana.lib') instead of input chemical shift list
* 'assigncs_sdfactor:=0.5' to scale BMRB stddev by given factor


=== CALCstructcalc.cya: follow automated shift assignment by automated NOESY assignment and structure calculation ===
shiftassign_statistics:=predicted.prot
* peak lists for distance restraint generation specified by parameter 'structurepeaks='
 
 
'''Modify the reported statistics'''
 
Groups of atoms for which assignment statistics will be calculated and reported in the 'flya.txt' output file can be defined as:
 
analyzeassign_group := BB: N H CA CB C
 
In this case, the command defines a group called BB (a name that can be chosen freely) comprising the atoms N, H, CA, CB, C.
 
 
The optional parameter 'shiftreference=manREF.prot' specifies a reference chemical shift list, used only for comparison in flya.tab, flya.txt, flya.pdf:
 
  shiftassign_reference:=manREF.prot
 
The same parameter may also be set as part of the flya command:


=== CALCstructure.cya: use input structure to generate expected peaks for through-space experiments ===
flya runs=10 assignpeaks=$peaks shiftreference=manREF.prot
* specify with parameter 'structure' of command 'flya'
 
 
'''Modify the expected peak lists'''
 
Specific labeling can be handled and peak list-specific atom selections can be applied.
 
To restrict the generation of expected peaks to a subset of atoms, here the backbone atoms:
 
command select_atoms
  atom select "N H CA CB C"
end
 
 
Input structures may be used to generate expected peaks for through-space experiments:
* specify with parameter 'structure' of the command 'flya'
* if parameter 'structure' is absent, a set of random structures is generated automatically
* if set to blank ('structure='), no random structures are generated (if not needed because only through-bond spectra are used)


<!--
flya runs=10 assignpeaks=$peaks structure=XXX.pdb
== Using input chemical shifts: shift predictions or partial assignments (optional) ==


Input chemical shift can be used in three ways.
Experimental peaks may also be employed as expected peak lists:
* command N15NOESY_expect, reading input peak list N15NOESY_in.peaks
N15NOESY_expect :=N15NOESY_in


These shifts will only be used for comparison (e.g. in flya.tab, flya.txt, flya.pdf):


shiftassign_reference:=ref.prot
'''Keeping previously determined assignments'''


Shifts and standard deviations in the file 'predicted.prot' (not provided in this practical) will replace the general statistics from cyana.lib (CSTABLE):
To keep input peak assignments in user peak assignments:
* (partially) assigned input peak list XXX.peaks
* parameter 'keepassigned' for 'loadspectra.cya'


  shiftassign_statistics:=predicted.prot
  loadspectra_keepassigned:=.true.
 
To fix input chemical shift assignments contained in a prot file


Shifts in the file 'fix.prot' will be fixed to the input values
To do this, e.g. for backbone atoms extracted from the manREF.prot list:


shiftassign_fix:=fix.prot
Make a list of only the reference backbone chemical shifts by entering the CYANA commands:


The latter approach can for instance be used to perform sidechain assignment when the backbone assignment is already known.  
read manREF.prot
  atom set "* - H N CA CB C" shift=none
write fix.prot
The file 'fix.prot' will contain the reference chemical shifts only for the backbone (and CB) atoms H, N, CA, CB, C'. Now you can repeat the assignment calculation by inserting the 'shiftassign_fix:=fix.prot' statement in 'CALC.cya' and choosing only the input peak lists that are relevant for sidechain assignment:


If you want to do this, copy the original data to a new directory:
shiftassign_fix:=fix.prot


cd ~/guentert
tar zxf Flyaembo.tgz
mv flyaembo flyasc
cd flyasc


Then make a list of only the reference backbone chemical shifts. Start CYANA. In CYANA, enter the commands
'''Chemical shift assignment using exclusively NOESY'''


read ref.prot
*increased population size with 'shiftassign_population=200'
atom set "* - H N CA CB C" shift=none
* see Schmidt et al. J. Biomol. NMR 57, 193-204 (2013)
write fix.prot
q


The file 'fix.prot' will contain the reference chemical shifts only for the backbone (and CB) atoms H, N, CA, CB, C'. Now you can repeat the assignment calculation by inserting the 'shiftassign_fix:=fix.prot' statement in 'ASSIGN.cya' and choosing only the input peak lists that are relevant for sidechain assignment:


shiftassign_fix:=fix.prot
'''Speeding up FLYA runs'''
noesy:=N15NOESY,C13NOESY
assignpeaks:=C13H1,N15H1,HCCH24,HCCH7,HBHACONH,C_CO_NH,HC_CO_NH


-->
These options serve fast automated chemical shift assignment. The results are in general less accurate, since the populations are smaller, there are fewer parallel runs, or the optimization schedule is modified.
<!--
== Fully automated structure calculation ==


Automated resonance assignment, automated NOE restraint assignment, and the structure calculation by torsion angle dynamics can be combined by running the 'flya' command in 'ASSIGN.cya' with the additional parameter 'stage=1':
In production runs, better results can be expected (at the expense of longer computation times) if these parameters are not set.


flya runs=10 shiftreference=ref.prot structurepeaks=$structurepeaks assignpeaks=$assignpeaks stage=1


The 'flya.prot' file from the automated resonance assignment will be used together with the (unassigned) NOESY peak lists to assign the NOESY peaks and to generate distance restraints in order to compute the three-dimensional structure of the protein.  
There are three parameters of the assignment algorithm that can be set in order to speed up the calculation.


To speed up the calculation, you can set in 'ASSIGN.cya' (above the 'flya' command):
Fixed number of generations in evolutionary optimization:


  structures:=25,5
  shiftassign_population=25
steps=4000


These commands tell the program to calculate, in each cycle, 25 conformers, and to analyze the best 5 of them. 4000 torsion angle dynamics steps will be applied per conformer.
The population size for the genetic algorithm, i.e. how many assignments form one generation (25; chosen smaller than in normal production runs in order to speed up the calculation).


7 cycles of automated NOE assignment and structure calculation will be performed. Statistics on the NOE assignment and the structure calculation will be in the file 'Table', which can also be produced with the command 'cyanatable -lp'.
There is also an option to choose the "quick" optimization schedule:


The final structure will be 'final.pdb'. <!--You can visualize it, for example, with the command
shiftassign_quick=.true.


molmol -r 8-110 final.pdb
Finally, the 'runs' option can be set for flya, as we did in the exercise ('flya runs=10').


The optimal residue range for superposition can be found with the command
<!--_
* peak lists for distance restraint generation specified by parameter 'structurepeaks='  (used to pass on from flya to noeasign if done in one go)
--->


cyana overlay final.pdb
=== noeassign options ===


or with the [http://www.bpc.uni-frankfurt.de/cyrange.html CYRANGE web server].
To learn more about noeassign consult the tutorial [[Structure calculation with automated NOESY assignment]]. Other options for noeassign are described here: [[CYANA_Macro:_noeassign]]
-->


== Automated NOESY assignment and structure calculation ==
=== Exercise 13: Mapping restraints onto a known structure ===


Automated NOE restraint assignment and the structure calculation by torsion angle dynamics is included in 'CALCstructcalc.cya' (see above).
One can map the calculated restraints, such as distance restraints (upl/lol) onto a known structure (in the example here an xray structure). This is another approach to analyze restraints and their influence on the results.


Alternatively, you can also perform automated NOE restraint assignment and the structure calculation separately with the 'CALCauto.cya' macro. The 'flya.prot' file from the automated resonance assignment (e.g. with 'CALC.cya'; backbone and side-chain assignments are required!) will be used together with the (unassigned) NOESY peak lists to assign the NOESY peaks and to generate distance restraints in order to compute the three-dimensional structure of the protein.  
Below you find the commands to accomplish this. You see by studying the commands, which files are needed to execute the macro. Therefore, create a new directory ('mkdir') or copy a directory containing the respective files. Delete what you do not need. Use the regularized xray structure from exercise 11.


TALOS-N can be used to generate torsion angle restraints from the backbone chemical shifts in 'flya.prot'. To do this, use the CYANA commands
Commands preceded by a hash sign (#) are commented out; remove the hash signs if you want to use them. If you decide to use the intermol-NOEs-cycle7.peaks file, make sure to comment out any commands you no longer need.


read flya.prot
You need an init file:
talos
write talos.aco


This will call the program TALOS-N and store the resulting torsion angle restraints in the file 'talos.aco'.
rmsdrange:=15-111,333
cyanalib
read lib LIG.lib append


For further information about automated NOESY assignment you can consult the Tutorial [[Structure calculation with automated NOESY assignment]] (which uses different file names than we have here).
And the main macro (name it 'CALC_xraymap.cya'):


To speed up the calculation, you can set in 'CALCauto.cya':
read seq demoLong.seq


  structures:=50,10
The following block of commands takes the assigned intermol.peaks list and calculates distance restraints from the peak intensities:
  steps=5000
  #peaks:=intermol-NOEs-cycle7.peaks
  #calibration peaks=$peaks
#peaks calibrate simple
#write upl intermol.upl


These commands tell the program to calculate, in each cycle, 50 conformers, and to analyze the best 10 of them. 5000 torsion angle dynamics steps will be applied per conformer.
The following block of commands reads the 'final.upl' list (in this case from noeassign), selects the intermolecular NOEs to LIG, and writes them to a file:
read upl final.upl
distance select "*, @LIG" info=full
write intermol.upl


7 cycles of automated NOE assignment and structure calculation will be performed by running the command
read intermol.upl unknown=warn
#read upl lig.upl append
#read lol lig.lol
read regula.pdb unknown=warn
weight_vdw=0
overview intermol_xray.ovw


cyana "nproc=4; CALC"
*If the restraints do not match with the xray structure, does it mean they are wrong?
*If you tried the two options, what is (are) the difference(s)?
*Did you look at the LIG.upl/lol files in the demo_data folder, what are they? What type of NMR experiments are there to obtain them?


Statistics on the NOE assignment and the structure calculation will be in the file 'Table', which can also be produced with the command 'cyanatable -lp'.
=== Exercise 14: Work on improving the final structure ===


The final structure will be 'final.pdb'. You can visualize it, for example, with the command
Using what you have learned so far, employing some of the options of FLYA and noeassign, consider if it is possible to improve the resolution of the final structure.


pymol final.pdb


The optimal residue range for superposition can be found with the command
General questions to answer regarding this task:
*Name additional experimental restraints (or inputs) you could use for structure calculation.
*Name additional NMR experiments you could measure, to acquire experimental data that are not supplied with the demo_data.


cyana overlay final.pdb


or with the [http://www.bpc.uni-frankfurt.de/cyrange.html CYRANGE web server]. To superimpose the structures in PyMOL, you can use the internal PYMOL command with the appropriate residue range:
<!---
For a structure calculation starting from given experimental restraints use the commands below:


intra_fit resi 8-110
The 'init.cya' macro:
set all_states


<!--
rmsdrange:=15-111,333
=== Download results of fully automated structure calculation ===
cyanalib
read lib LIG.lib append


If you cannot complete the fully automated structure calculation but want to look at the results that have been calculated previously, you may download them [[Media:flyaemboresults.tgz|here]] (about 24 MB).
The 'CALC.cya' macro:
-->
read seq demoLong.seq
read aco talos.aco
read upl final.upl
#read upl lig.upl append
#read lol lig.lol
randomseed:= 434726
calc_all structures=100 command=anneal steps=10000 
overview calc.ovw structures=20 pdb
--->

Latest revision as of 17:33, 1 March 2018

In this tutorial we will determine the resonance assignments and the structure of a protein-ligand complex using modules of CYANA.

To this end we will first run the CYANA module FLYA to obtain the resonance assignments from backbone, side chain and NOESY experiments (actually, the XEASY peak lists of these experiments).

Then we will use noeassign to assign the NOESY spectra and calculate the holo protein structure without the ligand.

In a next step we will first draw the ligand, convert the obtained SMILES code to a *.mol2 file and generate the *.lib file for CYANA.

Then we will assign intermolecular peak lists and redo the structure calculation, this time of the protein-ligand complex.

Finally, you will compare the calculated NMR structure to an X-ray structure and generate statistics.

And ultimately you can try to improve your structure results by studying and applying the options available within the FLYA and noeassign modules of CYANA.


CYANA setup for the AUREMN Practical NMR Course in Campino (24-26 February 2018)

Please follow the steps below carefully (exact Linux commands are given below; you may copy them to a terminal):

  1. Go to your home directory (or data directory).
  2. Get the data for the practical from the server (AUREMN2018.tar.gz).
  3. Unpack the input data for the practical.
  4. Get the demo version of CYANA for this practical.
  5. Unpack CYANA.
  6. Set up the CYANA environment variables.
  7. Change into the newly created directory 'AUREMN2018'.
  8. Copy the demo_data directory to 'flyabb'.
  9. Change into the subdirectory 'flyabb'.
  10. Test whether CYANA can be started by typing its name, 'cyana'.
  11. Exit from CYANA by typing 'q' or 'quit'.
  12. Download Chimera (to your personal laptop) from: Chimera
  13. Download Avogadro (to your personal laptop) from: Avogadro
cd ~
cp /home/julien/AUREMN2018.tar.gz .
tar zxf AUREMN2018.tar.gz 

wget 'http://www.cyana.org/wiki/images/6/64/Cyana-3.98bin-180213Demo.tgz'
tar zxf Cyana-3.98bin-180213Demo.tgz
cd cyana-3.98/
./setup

cd ~

cd AUREMN2018
cp -r demo_data flyabb 
cd flyabb


cyana
___________________________________________________________________

CYANA 3.98 (mac-intel)

Copyright (c) 2002-17 Peter Guentert. All rights reserved.
___________________________________________________________________

    Demo license valid for specific sequences until 2018-12-31

    Library file "/Users/deans/cyana-3.98/lib/cyana.lib" read, 41 residue types.
*** ERROR: Illegal residue name "LIG".
*** ERROR: Cannot read line 114:
           LIG  333
cyana> q

If all worked, you are ready to go as far as CYANA is concerned! You see the ERROR message because there is a sequence file in the directory, but no library file for the ligand yet. Don't worry, this is expected and you will take care of it during the exercise.

If you want to return to your practical later, using your own Linux or Mac OS X computer, you can download the demo version of CYANA from here.

Hint: More information on the CYANA commands etc. is in the CYANA 3.0 Reference Manual.

Automated resonance assignment

Resonance assignment within CYANA is done using the module FLYA.

In the most general sense, two types of experiments are used for protein resonance assignment: through-bond, TOCSY-type experiments and through-space, NOESY-type experiments. Each of the two carries distinct information that helps the resonance assignment. The HSQC, HMQC or TROSY elements of these experiments merely improve the resolution by separating resonances according to spin type (1H, 13C, 15N) into additional dimensions.

At the very minimum, for small systems and in favorable cases, a NOESY experiment may be sufficient to get an assignment and enough distance restraints for a structure calculation.

Experimental input data

Spectra are processed and referenced relative to each other. Peak lists in XEASY format are prepared by automatic peak picking with a visualization program such as CcpNmr Analysis, NMRDraw or NMRView and saved as XXX.peaks, where XXX denotes the name of the XEASY peak list file. They are then cleaned (unnecessary water and noise peaks are removed).

As part of the data supplied for the exercises, experimental peak lists are available for the following spectra:

  • HNtrosy (spectrum type 'N15HSQC' in the CYANA library)
  • trHNCA (spectrum type 'HNCA' in the CYANA library)
  • HNCOCA (spectrum type 'HNcoCA' in the CYANA library)
  • HNCACB (spectrum type 'CBCANH' in the CYANA library)
  • HCCCHTOCSY (the spectrum type will have to be determined in the first exercise)
  • NTOCSY (spectrum type 'N15TOCSY' in the CYANA library)
  • 3D 13C-resolved NOESY called aro (spectrum type 'C13NOESY' in the CYANA library)
  • 3D 13C-resolved NOESY called cnoesy (spectrum type 'C13NOESY' in the CYANA library)
  • 3D 15N-resolved NOESY called nnoesy (spectrum type 'N15NOESY' in the CYANA library)

Each peak list starts with a header that defines the experiment type and the order of dimensions. For instance, for HNCA.peaks:

# Number of dimensions 3
#FORMAT xeasy3D
#INAME 1 HN
#INAME 2 C
#INAME 3 N
#SPECTRUM HNCA HN C N
      5   6.475  58.033  98.548 1 U   2.769E+02  0.000E+00 e 0     0     0     0
      6   6.476  62.123  98.126 1 U   2.571E+01  0.000E+00 e 0     0     0     0
      7   6.475  54.017  98.159 1 U   2.547E+01  0.000E+00 e 0     0     0     0

The first line specifies the number of dimensions (3 in this case). The '#SPECTRUM' line (no space between '#' and 'SPECTRUM') gives the experiment type (HNCA, which refers to the corresponding experiment definition in the CYANA library), followed by an identifier for each dimension of the peak list (HN C N) that specifies which chemical shift is stored in the corresponding dimension of the peak list. The experiment type and identifiers must correspond to an experiment definition in the general CYANA library (see below). If a definition is missing for an experiment, it must be added to the CYANA library. The '#SPECTRUM' line is followed by one line for every peak. For example, the first peak in the 'HNCA.peaks' list has

  • Peak number 5
  • HN chemical shift 6.475 ppm
  • C (CA) chemical shift 58.033 ppm
  • N chemical shift 98.548 ppm

The other data are irrelevant for automated chemical shift assignment with FLYA. In particular, the peak volume or intensity (2.769E+02) is not used by the algorithm.
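
As an illustration of the format described above, the following Python sketch reads the '#SPECTRUM' line and the peak lines of an XEASY peak list. It is not part of CYANA: the function name parse_xeasy_peaks is hypothetical, and the sketch assumes that the '#SPECTRUM' line precedes the peak lines and that every peak line starts with the peak number followed by one chemical shift per dimension, as in the excerpt above.

```python
def parse_xeasy_peaks(text):
    """Return (experiment, dimension labels, peaks) from XEASY peak list text.

    Hypothetical helper for illustration; it only looks at the '#SPECTRUM'
    line and at the peak number and chemical shifts of each peak line.
    """
    experiment, labels, peaks = None, [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "#SPECTRUM":
            # e.g. "#SPECTRUM HNCA HN C N"
            experiment, labels = fields[1], fields[2:]
        elif not fields[0].startswith("#"):
            # Peak line: number, then one shift per dimension; the rest is ignored.
            number = int(fields[0])
            shifts = [float(x) for x in fields[1:1 + len(labels)]]
            peaks.append((number, dict(zip(labels, shifts))))
    return experiment, labels, peaks


EXAMPLE = """\
# Number of dimensions 3
#FORMAT xeasy3D
#INAME 1 HN
#INAME 2 C
#INAME 3 N
#SPECTRUM HNCA HN C N
      5   6.475  58.033  98.548 1 U   2.769E+02  0.000E+00 e 0     0     0     0
      6   6.476  62.123  98.126 1 U   2.571E+01  0.000E+00 e 0     0     0     0
"""

experiment, labels, peaks = parse_xeasy_peaks(EXAMPLE)
print(experiment, labels)
print(peaks[0])
```

The peak volume and the remaining columns are deliberately skipped, in line with the statement that they are not used for the automated assignment.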

Hint: The formats of other CYANA files are described in the CYANA 3.0 Reference Manual.


The protein sequence is supplied by three-letter code in a XXX.seq file.

As part of the supplied data for the exercises there are two sequences:

  • demoShort.seq (the protein sequence alone)
  • demoLong.seq (the protein sequence, ligand and a linker that connects the two molecules)

Linker sequences serve to keep two or more molecules close in coordinate space during calculations. A linker is usually 15-20 elements long and is composed of dummy atoms that allow the linking.

SPECTRUM definitions in the CYANA library

When you start CYANA, the program reads the library and displays the full path name of the library file. You can open the standard library file to inspect, for example, the NMR experiment definitions that define which expected peaks are generated by FLYA. For instance, the definition for the HNCA spectrum (search for 'HNCA' in the library file 'cyana.lib') is

SPECTRUM HNCA  HN N C
 0.980  HN:H_AMI  N:N_AM*  C:C_ALI  C_BYL
 0.800  HN:H_AMI  N:N_AMI  (C_ALI) C_BYL  C:C_ALI 

The first line corresponds to the '#SPECTRUM' line in the peak list. It specifies the experiment name and identifies the atoms that are detected in each dimension of the spectrum. The number of identifiers defines the dimensionality of the experiment (3 in case of HNCA).

Each line below defines a (formal) magnetization transfer pathway that gives rise to an expected peak. In the case of HNCA there are two lines, corresponding to the intraresidual and the sequential peak. For instance, the definition for the intraresidual peak starts with the probability to observe the peak (0.980), followed by a series of atom types, e.g. H_AMI for amide proton etc. An expected peak is generated for each molecular fragment in which these atom types occur connected by single covalent bonds. The atoms whose chemical shifts appear in the spectrum are identified by their labels followed by ':', e.g. for HNCA 'HN:', 'N:', and 'C:'.
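
To make the structure of such a definition concrete, here is a small Python sketch that splits the HNCA definition shown above into its name, dimension labels and transfer pathways. The function parse_spectrum_definition is a hypothetical helper, not a CYANA routine, and it only covers the simple layout of this example (one header line, then one pathway per line).

```python
def parse_spectrum_definition(text):
    """Split a SPECTRUM definition into name, labels and pathways.

    Each pathway is returned as (probability, atom types, observed), where
    'observed' maps a dimension label to the atom type written as LABEL:TYPE.
    """
    lines = [ln.split() for ln in text.strip().splitlines()]
    header = lines[0]                      # ['SPECTRUM', 'HNCA', 'HN', 'N', 'C']
    name, labels = header[1], header[2:]
    pathways = []
    for fields in lines[1:]:
        probability = float(fields[0])
        atoms = fields[1:]
        observed = {}
        for atom in atoms:
            if ":" in atom:
                label, atom_type = atom.split(":")
                observed[label] = atom_type
        pathways.append((probability, atoms, observed))
    return name, labels, pathways


HNCA_DEFINITION = """
SPECTRUM HNCA  HN N C
 0.980  HN:H_AMI  N:N_AM*  C:C_ALI  C_BYL
 0.800  HN:H_AMI  N:N_AMI  (C_ALI) C_BYL  C:C_ALI
"""

name, labels, pathways = parse_spectrum_definition(HNCA_DEFINITION)
print(name, labels)
for probability, atoms, observed in pathways:
    print(probability, observed)

# The peak list header above uses the order HN C N, the library HN N C:
# the labels are the same, only their order differs.
print(sorted(labels) == sorted(["HN", "C", "N"]))
```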

Exercise 1: Determine the spectrum type

For the HCCCHTocsy, determine the spectrum type and put the definition in the HCCCHTocsy.peaks file with the appropriate syntax.

The experiment is a TOCSY, a through-bond experiment. It allows you to see, in this case, from the backbone all the way out into the side chains.

  • Use the 'less' command (which displays files in the terminal without changing them) to search for the spectrum type in the 'cyana.lib' file.

Hint: Look at the definitions themselves and not just the SPECTRUM names, to determine which TOCSY is the appropriate one. Take the experiment with the most through-bond transfers.

  • work in the copy of the data directory ('cd flyabb')
  • Use a graphic text editor (or if you feel comfortable, use the vi terminal text editor) to manipulate the HCCCHTOCSY.peaks file and enter the appropriate spectrum type and identifiers in the correct order.
  • Check that the order of the dimensions in your SPECTRUM definition matches the actual experiment.

Depending on how the XEASY peak file was generated, the order of the dimensions does not have to match the way the experiment was recorded, and in general it does not match the SPECTRUM definition given in the 'cyana.lib' file either. If it does not match, do not change the 'cyana.lib' file but switch the order of the atom labels in the definition you add to your '*.peaks' file.


Hint: The order of the dimensions and the atom types can be determined quickly by looking at the columns of chemical shifts and detecting the chemical shift patterns.

This may be harder than it sounds at first; take your time to detect the pattern if it is not immediately obvious to you.

As you know, the chemical shifts of specific atom types and groups are quite distinct. If you need help with the chemical shift statistics, go to: BMRB
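
The idea of reading the nucleus type off the shift columns can be sketched in a few lines of Python. This is only a rough aid: the function guess_nuclei is hypothetical, and the thresholds (below 15 ppm for 1H, a median between 90 and 140 ppm for 15N or aromatic 13C) are rule-of-thumb assumptions made for this sketch, not values taken from CYANA. Check the BMRB statistics before relying on them.

```python
from statistics import median


def guess_nuclei(shift_rows, proton_max=15.0, n_range=(90.0, 140.0)):
    """Guess the nucleus type of each peak list column from its shift values.

    shift_rows: list of tuples, one tuple of chemical shifts (ppm) per peak.
    The thresholds are rough rules of thumb (assumptions of this sketch).
    """
    guesses = []
    for column in zip(*shift_rows):
        if max(column) < proton_max:
            guesses.append("1H")
        elif n_range[0] <= median(column) <= n_range[1]:
            # Amide 15N and aromatic 13C shifts overlap in this region.
            guesses.append("15N or aromatic 13C")
        else:
            guesses.append("13C")
    return guesses


# Shift columns of the three HNCA peaks shown earlier in this tutorial.
rows = [
    (6.475, 58.033, 98.548),
    (6.476, 62.123, 98.126),
    (6.475, 54.017, 98.159),
]
print(guess_nuclei(rows))
```

For a spectrum with two proton dimensions the sketch cannot tell which proton is bound to the heavy atom; that still has to be worked out from the peak pattern.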

Hint: For information on how to use the vi terminal editor: vi editor

Exercise 2: Run FLYA

  • work in the copy of the data directory ('cd flyabb')

Using the text editor of your choice, create your 'init.cya' macro as outlined (The init macro) and also your 'CALC.cya' macro (The FLYA CALC macro) to run FLYA. Be extra careful to avoid typos and unwanted spaces in comma-separated lists etc.

Execution scripts or "macros" in CYANA

For more complex tasks within CYANA, the necessary commands are collected in a file named '*.cya' rather than entered line by line at the CYANA prompt. Collecting the commands in macros has the added advantage that the macros serve as a record that allows previous calculations to be reconstructed.

The init macro

The initialization macro file has the fixed name 'init.cya' and is executed automatically each time CYANA is started. It can also be called any time one wants to reinitialize the program by typing 'init'. It contains normally at least two commands that read the CYANA library and the protein sequence:

rmsdrange:=15-111
cyanalib
read demoShort.seq

The first line sets the appropriate rmsdrange, and the command 'cyanalib' reads the standard CYANA library. The next command reads the protein sequence.

The protein sequence is stored in three-letter code in the file 'demoShort.seq'.

The FLYA CALC macro

The 'CALC.cya' starts with the specification of the names of the input peak lists:

peaks:=aro,cnoesy,nnoesy,HNtrosy,trHNCA,HNCOCA,HNCACB,NTocsy,HCCCHTocsy

The peak list names are separated by commas (without blanks!). The files on disk have the file name extension .peaks, e.g. 'HNCA.peaks'.

The command above will use all available peak lists. You can choose any subset of them by modifying the 'peaks:=...' statement.
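
Because blanks in the comma-separated list are an easy mistake to make, a quick check outside CYANA can help. The Python sketch below turns the value of the 'peaks:=...' statement into the file names that are expected on disk; peak_list_files is a hypothetical helper, and it assumes that the names are given without the '.peaks' extension, as in the statement above.

```python
def peak_list_files(peaks_value):
    """Turn the value of 'peaks:=...' into the list of expected file names."""
    if any(ch.isspace() for ch in peaks_value):
        raise ValueError("peak list names must be separated by commas without blanks")
    names = peaks_value.split(",")
    if "" in names:
        raise ValueError("empty name in peak list (stray comma?)")
    return [name + ".peaks" for name in names]


value = "aro,cnoesy,nnoesy,HNtrosy,trHNCA,HNCOCA,HNCACB,NTocsy,HCCCHTocsy"
for file_name in peak_list_files(value):
    print(file_name)
```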

These are followed by tolerances for chemical shift matching:

assigncs_accH=0.03
assigncs_accC=0.4
assigncs_accN=assigncs_accC
tolerance:=$assigncs_accH,$assigncs_accH,$assigncs_accC

In this case, a tolerance of 0.03 ppm will be used for protons, and 0.4 ppm for carbon and nitrogen. The 'assigncs_accX' variables are used within flya; the 'tolerance' variable is used in consolidation.
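
What "matching within tolerance" means can be shown with a minimal Python sketch. It simply compares the absolute difference of two shifts with the tolerance of the nucleus; how FLYA applies the tolerances internally is not described here, so this is an illustration under that assumption. The helper names are hypothetical, the element is taken from the first letter of the atom name, and the shift values are illustrative.

```python
# Tolerances in ppm, copied from the macro lines above.
TOLERANCE = {"H": 0.03, "C": 0.4, "N": 0.4}


def nucleus_of(atom_name):
    # Assumption of this sketch: the first letter of the atom name is the element.
    return atom_name[0]


def shifts_agree(atom_name, shift_a, shift_b, tolerance=TOLERANCE):
    """Return True if two chemical shifts (ppm) agree within the tolerance."""
    return abs(shift_a - shift_b) <= tolerance[nucleus_of(atom_name)]


print(shifts_agree("H", 8.571, 8.570))      # difference 0.001 ppm
print(shifts_agree("CA", 45.415, 45.433))   # difference 0.018 ppm
print(shifts_agree("C", 173.619, 174.576))  # difference 0.957 ppm
```

Pseudo atoms (names starting with Q) are not handled by this sketch.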

The next parameter specifies the seed value for the random number generator (an arbitrary positive integer is ok).

randomseed=101

The next parameter chooses the "quick" optimization schedule in order to speed up the calculation for this practical:

shiftassign_quick=.true.

In production runs, better results can be expected (at the expense of longer computation times) if this parameter is not set. The next command specifies that swapping of diastereotopic pairs is optimized with respect to the supplied reference assignment (only applied during consolidation):

shifts_consolidate_swap=.true.

Finally, there is the command to start the FLYA algorithm:

flya runs=10 assignpeaks=$peaks shiftreference=manREF.prot

Here, the given parameters of the 'flya' command specify the following:

  • The number of independent runs of the algorithm, from which the consolidated shift will be calculated (chosen smaller than in normal production runs in order to speed up the calculation).
  • The input peak lists that will be used (as defined above).
  • An ensemble of random structures will be calculated for generating expected peaks (leads to prediction of short-range NOEs in NOESY-type experiments).
  • The results will be compared with the reference chemical shifts in the file 'manREF.prot' (which have been determined by conventional methods).


When you have prepared the 'init.cya' and the 'CALC.cya' macros, start your FLYA ('CALC.cya') macro using 10 processors by calling it as outlined just below. Once the calculation starts, it will take between 10 and 20 minutes (depending on your system) to complete the assignment.

To run the FLYA calculation, one could start CYANA and execute the 'CALC.cya' macro from the CYANA prompt. However, on a computer with multiple processors it is better to speed up the calculation by running the 'CALC.cya' macro in parallel:

cyana -n 10 CALC.cya

This starts 10 independent calculations on 10 processors by using the MPI scheduler (if installed on your system, otherwise shared memory will be used).

It is strongly recommended that you check whether your calculation is actually running. If you made a mistake in one of the two macros, the calculation may not start, or it may be interrupted at some point.

ls -ltr 

This command allows you to check whether the calculation has generated intermediary files or final results, with a time stamp.


To check the general load on your local computer use:

top

To kill all processes running (from you):

skill -u <username>

FLYA output files

The FLYA algorithm will produce the following output files:

  • flya.prot: Consensus assigned chemical shifts. This file contains a chemical shift for every atom that has been assigned to at least one peak.
  • flya.tab: Table with details about the chemical shift assignment of each atom (comparison with reference shifts). In this file you can see for each atom whether the assignment is "strong" (self-consistent) or "weak" (only tentative).
  • flya.txt: Assignment statistics
  • flya.pdf: Graphical representation of the assignment results
  • XXX_exp.peaks: List of expected peaks, corresponding to input peak list XXX.peaks
  • XXX_asn.peaks: Assigned peak list, corresponding to input peak list XXX.peaks

The flya.txt file

This output file starts with overall assignment statistics for each group of atoms as defined by 'analyzeassign_group:=...':

   ____________________________________________________________

   CHEMICAL SHIFT ASSIGNMENT
   ____________________________________________________________

   SEED: 1
   chemical shifts for 542  atoms found
   Peaks assigned from frequencies

   BB: REFERENCES(2):512 CHEMICALSHIFTS(1):542 (1)and(2):512 MATCH:507(99.0% of (2))
  • REFERENCES(2) is the number of reference assignments (in the selected group)
  • CHEMICALSHIFTS(1) is the number of atoms assigned by FLYA
  • (1)and(2) is the number of atoms that are assigned by FLYA and in the reference.
  • MATCH is the number of atoms with the same assignment by FLYA and in the reference. The percentage is relative to the number of reference assignments.
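
The percentage in the statistics line can be reproduced by hand. The following Python sketch extracts the counts from the line shown above and recomputes MATCH relative to the number of reference assignments; the function name assignment_statistics is hypothetical and the regular expressions assume exactly the layout of this example line.

```python
import re


def assignment_statistics(line):
    """Extract the counts from a flya.txt group statistics line."""
    references = int(re.search(r"REFERENCES\(2\):(\d+)", line).group(1))
    assigned = int(re.search(r"CHEMICALSHIFTS\(1\):(\d+)", line).group(1))
    match = int(re.search(r"MATCH:(\d+)", line).group(1))
    return {
        "references": references,
        "assigned": assigned,
        "match": match,
        # The percentage is relative to the number of reference assignments.
        "match_percent": 100.0 * match / references,
    }


LINE = "BB: REFERENCES(2):512 CHEMICALSHIFTS(1):542 (1)and(2):512 MATCH:507(99.0% of (2))"
stats = assignment_statistics(LINE)
print(stats["match"], "of", stats["references"], "=", round(stats["match_percent"], 1), "%")
```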

Further below comes a table with information about each peak list:

   PEAKLISTS
   #Expected: Total number of expected peaks
   noRef: Number of expected peaks with missing reference shifts
   noPeak: Number of expected peaks for wich no peak can be measured
   Assigned: Number of expected peaks that could be assigned
   Match: Number of assigned peaks that fit reference shifts
   #Measured: Total number of peaks in peak list
   Assigned: Number of measured peaks that could be assigned to expected peaks
   exp/meas: Ratio of assigned expected and measured peaks

   Lists      #Expected  noRef   noPeak   Assigned        Match    #Measured Assigned  exp/meas Assigned
   N15HSQC        106       8       1   104( 98.11%)    97( 91.51%)    131     96( 73.28%)   1.1
   HNCA           211      15      11   194( 91.94%)   186( 88.15%)    329    179( 54.41%)   1.1
   HNcaCO         211      15      11   197( 93.36%)   183( 86.73%)    246    176( 71.54%)   1.1
   HNCO           105       7       1   101( 96.19%)    97( 92.38%)    158     97( 61.39%)   1.0
   HNcoCA         105       7       0   101( 96.19%)    97( 92.38%)    158     99( 62.66%)   1.0
   CBCANH         399      26      25   361( 90.48%)   350( 87.72%)    623    339( 54.41%)   1.1
   CBCAcoNH       200      13       2   196( 98.00%)   185( 92.50%)    324    192( 59.26%)   1.0
   ALL           1337      91      51  1254( 93.79%)  1195( 89.38%)   1969   1178( 59.83%)   1.1

It contains the following data:

  • #Expected: Total number of expected peaks
  • noRef: Number of expected peaks with missing reference shifts
  • noPeak: Number of expected peaks for which no peak can be measured
  • Assigned: Number of expected peaks that could be assigned based on the reference chemical shift assignments. The theoretical maximum of 100% corresponds to the situation that the spectra “explain” all expected peaks. Each expected peak can be mapped to at most one measured peak. Remaining expected peaks correspond to missing peaks in the measured peak list.
  • Match: Number of assigned peaks that fit (within tolerance) reference shifts. The theoretical maximum of 100% corresponds to having all measured peaks assigned. Note that several expected peaks can be mapped to the same measured peak, i.e. the assignments of measured peaks can be unambiguous or ambiguous. Remaining unassigned measured peaks are likely to be artifacts.
  • #Measured: Total number of peaks in peak list
  • Assigned: Number of measured peaks that could be assigned to expected peaks
  • exp/meas: Ratio of assigned expected and measured peaks
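
The percentages in the table follow directly from the counts, which also shows what each percentage is relative to. A short Python sketch for the N15HSQC row of the example table is given below; the function name peaklist_percentages is hypothetical.

```python
def peaklist_percentages(expected, assigned_expected, match, measured, assigned_measured):
    """Recompute the percentages and the exp/meas ratio of one table row."""
    return {
        # Assigned and Match are relative to the number of expected peaks.
        "assigned_expected": round(100.0 * assigned_expected / expected, 2),
        "match": round(100.0 * match / expected, 2),
        # The second Assigned column is relative to the number of measured peaks.
        "assigned_measured": round(100.0 * assigned_measured / measured, 2),
        # exp/meas: assigned expected peaks per assigned measured peak.
        "exp_per_meas": round(assigned_expected / assigned_measured, 1),
    }


# N15HSQC row: 106 expected, 104 assigned, 97 match, 131 measured, 96 assigned
row = peaklist_percentages(106, 104, 97, 131, 96)
print(row)
```

A ratio above 1.0 reflects that several expected peaks can be mapped to the same measured peak.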

There is more information on the results of the assignment calculation in the 'flya.txt' file (not described here).

The flya.tab file

This file provides information about the chemical shift assignment of each individual atom:

   Atom  Residue      Ref   Shift     Dev  Extent  inside   inref
   ...
   N     GLY   57 102.109 102.043   0.066    10.0   100.0   100.0  strong=
   H     GLY   57   8.571   8.570   0.001    10.0   100.0   100.0  strong=
   CA    GLY   57  45.415  45.433  -0.018    10.0   100.0   100.0  strong=
   HA2   GLY   57   4.042
   HA3   GLY   57   3.436
   C     GLY   57 173.621 173.662  -0.041    10.0    89.4    90.0  strong=
   N     LEU   58 120.640 120.649  -0.009    10.0    80.0    80.0  =
   H     LEU   58   7.488   7.492  -0.004    10.0    79.8    80.0  =
   CA    LEU   58  51.943  51.940   0.003    10.0    70.0    70.0  =
   HA    LEU   58   4.995
   CB    LEU   58  45.602  45.568   0.034    10.0    82.7    80.0  strong=
   CG    LEU   58  26.528
   HG    LEU   58   1.515
   CD1   LEU   58  24.745
   C     LEU   58 173.619 174.576  -0.957    10.0    40.1    10.0  ! (C 59)
   ...
  • Ref: Chemical shift value in the reference chemical shift list (here 'manREF.prot'). It was not used in the calculation.
  • Shift: Consensus chemical shift value from FLYA
  • Dev = Ref - Shift
  • Extent: Number of runs in which the atom was assigned by FLYA.
  • Inside: Percentage of chemical shift values from the (10) independent runs of FLYA that agree (within the tolerance) with the consensus value.
  • inref: Percentage of chemical shift values from the (10) independent runs of FLYA that agree (within the tolerance) with the reference value.
  • Outcome of the assignment:
    • strong: "strong" assignment, i.e. Inside > 80%.
    • =: Assignment that agrees with the reference, i.e. |Dev| < tolerance.
    • !: Assignment that does not agree with the reference, i.e. |Dev| > tolerance.
    • (atom name): Correct assignment, if within the same residue (no residue number given), or the neighboring residues.
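
For a quick overview of which atoms are strongly assigned, the rows of 'flya.tab' can be read with a few lines of Python. The sketch below assumes that every row has the layout of the excerpt above (atom, residue, number, Ref, and, if FLYA assigned the atom, Shift, Dev, Extent, inside, inref and the outcome). Rows without a reference shift may look different and are not handled; parse_flya_tab_row is a hypothetical helper.

```python
def parse_flya_tab_row(line):
    """Parse one atom row of flya.tab (layout assumed as in the excerpt above)."""
    fields = line.split()
    row = {"atom": fields[0], "residue": fields[1], "number": int(fields[2]),
           "ref": float(fields[3])}
    if len(fields) < 9:
        # Only a reference shift is present: the atom was not assigned by FLYA.
        row["assigned"] = False
        return row
    shift, dev, extent, inside, inref = (float(x) for x in fields[4:9])
    outcome = " ".join(fields[9:])
    row.update(assigned=True, shift=shift, dev=dev, extent=extent,
               inside=inside, inref=inref,
               strong=outcome.startswith("strong"),
               agrees="=" in outcome, mismatch="!" in outcome)
    return row


SAMPLE = """
   N     GLY   57 102.109 102.043   0.066    10.0   100.0   100.0  strong=
   HA2   GLY   57   4.042
   N     LEU   58 120.640 120.649  -0.009    10.0    80.0    80.0  =
   C     LEU   58 173.619 174.576  -0.957    10.0    40.1    10.0  ! (C 59)
"""

rows = [parse_flya_tab_row(ln) for ln in SAMPLE.splitlines() if ln.strip()]
strong = [(r["atom"], r["number"]) for r in rows if r.get("strong")]
print("strong assignments:", strong)
print("disagree with reference:", [(r["atom"], r["number"]) for r in rows if r.get("mismatch")])
```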

The flya.pdf file

This PDF file provides a graphical representation of the 'flya.tab' file. Each assignment for an atom is represented by a colored rectangle.

flya.pdf
  • Green: Assignment by FLYA agrees with the manually determined reference assignment (within tolerance)
  • Red: Assignment by FLYA does not agree with the manually determined reference assignment
  • Blue: Assigned by FLYA but no reference available
  • Black: With reference assignment but not assigned by FLYA.

Respective light colors indicate assignments not classified as strong by the chemical shift consolidation. The row labeled HN/Hα shows for each residue HN on the left and Hα in the center. The N/Cα/C’ row shows for each residue the N, Cα, and C’ assignments from left to right. The rows β-η show the side-chain assignments for the heavy atoms in the center and hydrogen atoms to the left and right. In the case of branched side-chains, the corresponding row is split into an upper part for one branch and a lower part for the other branch.

Exercise 3: Analyze the FLYA results

  • Analyze your FLYA results using 'less' or a graphical text editor and a pdf viewer.
  • What do you think?
  • How does the automated assignment compare to the provided assignment?
  • How robust do you think the results are?
  • What could you do to likely improve the result?

Hint: Use the terminal command 'gs' to view pdf files (control-C to quit gracefully):

gs flya.pdf

Using Talos to generate torsion angle restraints

Torsion angle restraints from the backbone chemical shifts help restrict angular conformation space. We wish to use only "strong assignments" to generate these restraints.

If you do not have TALOS installed, get it from here. It is part of the NMRPipe software package.

Exercise 4: Calculate backbone torsion angle restraints using Talos

Hint: Copy the FLYA results into a new folder, since otherwise you will overwrite your original 'flya.prot' file.

Essentially you will need to copy the details directory and the 'flya.prot' file.

cp -r flyabb acoPREP
cd acoPREP
rm *.peaks *.out *.job

Use a text editor of your choice to create a 'CALC.cya' file with the commands to calculate the talos angle restraints.

TALOS is used to generate torsion angle restraints from the backbone chemical shifts in 'flya.prot'.

consolidate reference=flya.prot file=flya.tab plot=flya.pdf prot=details/a[0-9][0-9][0-9].prot

This overwrites the original flya.prot with only strong assignments.

read prot flya-strong.prot unknown=skip

talos talos=talos+                
talosaco pred.tab

write aco talos.aco

This will call the program TALOS+ and store the resulting torsion angle restraints in the file 'talos.aco'.

Since this is not a calculation suited for the MPI scheduler, start CYANA first, then call the 'CALC.cya' macro from the prompt.

Hint: Change to a C shell before running CYANA (since TALOS needs a C shell to run):

csh

Automated NOESY assignment and structure calculation

We will perform an automated NOE restraint assignment and structure calculation by torsion angle dynamics.

The 'flya.prot' file from the automated resonance assignment will be used together with the (unassigned) NOESY peak lists to assign the NOESY peaks and to generate distance restraints. The structure is calculated in cycles, essentially testing the NOE assignment and iteratively refining it, in order to compute the three-dimensional structure of the protein.

Exercise 5: Run noeassign

Copy the 'flyabb' directory and give it the name 'noebb', then delete all the files and data we do not need, to reduce clutter and keep a better overview.

cp -r flyabb noebb
cd noebb
rm *asn.peaks *exp.peaks *.out *.job
rm -rf details

From the directory 'acoPREP' copy the calculated talos restraints ('talos.aco').

Inside the 'noebb' directory, use a text editor to edit the 'CALC.cya' file for noeassign as outlined.

The noeassign CALC macro

peaks:= cnoesy.peaks,nnoesy.peaks,aro.peaks 	
prot:= flya.prot                   		
restraints:= talos.aco                    		
tolerance:= 0.040,0.030,0.45            		
structures := 100,20                      		
steps:= 10000                       		
randomseed:= 434726    
                  		
noeassign peaks=$peaks prot=$prot autoaco

To speed up the calculation, you can set optionally in 'CALC.cya':

structures:=50,10
steps=5000

These commands tell the program to calculate 50 conformers in each cycle and to analyze the best 10 of them; 5000 torsion angle dynamics steps will be applied per conformer. If you do not set these options, 100 conformers will be calculated, and the 20 best will be analyzed and kept.


When you are done preparing the macros as outlined run the calculation.

The automated NOE assignment and structure calculation will be performed by running the 'CALC.cya' macro:

cyana -n 33 CALC.cya

This means that each processor will calculate about 100/33 ≈ 3 conformers. If you changed the setup to calculate 50 structures, you would start the calculation with 'cyana -n 25 CALC.cya'. 7 cycles of calculations will be performed.
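
The arithmetic behind the choice of the processor count can be written down explicitly. The Python sketch below assumes that the conformers are split as evenly as possible over the processors; how the CYANA scheduler actually distributes the jobs is not specified here, so the function conformers_per_processor is only an illustration.

```python
def conformers_per_processor(structures, processors):
    """Split a number of conformers as evenly as possible over processors."""
    base, extra = divmod(structures, processors)
    # 'extra' processors get one conformer more than the others.
    return [base + 1] * extra + [base] * (processors - extra)


split_100 = conformers_per_processor(100, 33)
print(max(split_100), min(split_100), sum(split_100))

split_50 = conformers_per_processor(50, 25)
print(max(split_50), min(split_50), sum(split_50))
```

With 100 conformers on 33 processors, one processor has to calculate a fourth conformer; with 50 conformers on 25 processors the split is exact.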

Statistics on the NOE assignment and the structure calculation will be in the file 'Table', which can also be produced with the command 'cyanatable -lp'.

The final structure will be 'final.pdb'. You can visualize it, for example, with the command

chimera final.pdb

The optimal residue range for superposition can be found with the command

cyana overlay final.pdb

Run noeassign with your 'CALC.cya' macro.

You can check the statistics (and success of 'noeassign') by running:

cyanatable

Creating the ligand library file for CYANA

In the next three exercises you will create the ligand library file for CYANA from scratch. Do this carefully and check your result, otherwise your structure calculation will not work as intended.

Exercise 6: Drawing the molecule and obtaining the SMILES code

  • Make a copy of the 'libex' directory and work in it ('libexbb').
cp -r libex libexbb
cd libexbb

Go to the ZINC website.

Click on the Structure tab and draw the molecule using the supplied drawing (LIG.png) of the compound as a guide. Copy the SMILES code.


Hint: To look at the supplied image file in the terminal, use:

xdg-open LIG.png

Exercise 7: Converting the SMILES code to mol2

  • work in the copy of the libex directory ('cd libexbb')

There are many options and programs to do this; we outline two:


If you can use Avogadro (best):

For Mac OS download Avogadro from: Avogadro

Build --> Insert --> SMILES

Paste the SMILES code

Extensions --> Optimize Geometry

Save as

--> LIG.mol2 (*.mol2)


If you have to use Chimera:

(If you are on the Linux server, Chimera is installed.)

Tools --> Structure Editing --> Build Structure --> Start Structure

--> SMILES string

Set the residue name to LIG (capital letters)

--> Apply

Save your mol2 file as: LIG.mol2

Now, there is one issue we have to take care of: The intermolecular NOE assignments have to match the ligand structure assignment, otherwise the intermolecular NOEs will be wrong.

Using the text editor of your choice, manually change the "UNL1" in your mol2 file to "LIG".

Then open the supplied demoLIG.pdb structure in chimera, as well as the created mol2 structure.

chimera demoLIG.pdb LIG.mol2

First check the geometry (especially the rings and the stereochemistry). If it is wrong fix it!

Using the text editor of your choice, or more conveniently using chimera change the proton names in your mol2 file to match those of the pdb.

Hint: Overlay the two ligand structures in chimera.

Favorites --> Command Line

In the command line enter:

match #1 #0

Depending on how the models are loaded you may need to change the #? numbers. To see the model number use the Favorites --> Model Panel.

If chimera complains about "Unequal numbers of atoms chosen for evaluation", delete the pseudo atoms of 'demoLIG.pdb' temporarily for the overlay. In the command line

sel: @Q @Q? @Q??
delete sel

To rename selected atoms (control click) in the command line:

setattr a name HX sel

Hovering over atoms will display their names!
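
Whether the atom names of the two files really match can also be checked outside Chimera. The following Python sketch lists the names that occur in only one of the files. It relies on the standard layouts of the formats (atom name in the second column of the '@<TRIPOS>ATOM' section of a mol2 file, and in columns 13-16 of ATOM/HETATM records of a PDB file) and skips pseudo atoms, assumed here to be the names starting with Q. The function names and the demo data are hypothetical.

```python
def mol2_atom_names(text):
    """Atom names from the @<TRIPOS>ATOM section of a mol2 file."""
    names, in_atoms = [], False
    for line in text.splitlines():
        if line.startswith("@<TRIPOS>"):
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        if in_atoms and line.split():
            names.append(line.split()[1])
    return names


def pdb_atom_names(text):
    """Atom names (columns 13-16) from ATOM/HETATM records of a PDB file."""
    return [line[12:16].strip() for line in text.splitlines()
            if line.startswith(("ATOM", "HETATM"))]


def name_differences(mol2_text, pdb_text, skip_pseudo=True):
    """Return (names only in the mol2 file, names only in the PDB file)."""
    mol2 = set(mol2_atom_names(mol2_text))
    pdb = set(pdb_atom_names(pdb_text))
    if skip_pseudo:
        pdb = {name for name in pdb if not name.startswith("Q")}
    return sorted(mol2 - pdb), sorted(pdb - mol2)


# Made-up two-atom demo data, only to show the call.
MOL2_DEMO = """@<TRIPOS>MOLECULE
LIG
 2 1 0 0 0
SMALL
GASTEIGER

@<TRIPOS>ATOM
      1 C1          0.0000    0.0000    0.0000 C.3     1  LIG1        0.0000
      2 H           1.0900    0.0000    0.0000 H       1  LIG1        0.0000
@<TRIPOS>BOND
     1     1     2    1
"""

PDB_DEMO = """HETATM    1  C1  LIG A 333       0.000   0.000   0.000
HETATM    2  H1  LIG A 333       1.090   0.000   0.000
HETATM    3  Q1  LIG A 333       1.090   0.000   0.000
"""

print(name_differences(MOL2_DEMO, PDB_DEMO))
```

To use it on the real files, read 'LIG.mol2' and 'demoLIG.pdb' with open(...).read() and pass the two strings to name_differences; both returned lists should be empty once the renaming is complete.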

Exercise 8: Converting the mol2 file to a lib file for CYANA

  • work in the copy of the libex directory ('cd libexbb')
  • unpack the tool to convert the mol2 to a *.lib file
tar zxf cylib-2.0.tgz

run cylib with the options -nc -sc

./cylib-2.0/cylib -nc -sc LIG.mol2

this will create the LIG.lib file.

The -sc option keeps the angles of the rings fixed. We can do this since the rings in this molecule are either aromatic or contain sp3 conjugated carbons, which fixes the ring geometry. If they had to be flexible, you would need to keep the angles flexible and supply additional restraints to close the rings.

To test the lib file we need CYANA:

Create a sequence file containing 'LIG 333' and name the file 'LIG.seq'.

Start CYANA. This will read the CYANA library file correctly but give you the error:

*** ERROR: Illegal residue name "LIG".
*** ERROR: Cannot read line 1:
                     LIG     333

Because we do not have an init file and have not read the 'LIG.lib' file yet, the program just tries to read the default sequence file in the directory, but the ligand is not yet in the library, so it fails...

read lib LIG.lib append
read seq LIG.seq

anneal

atoms select "* - &DUMMY"

pseudo=1
write pdb test.pdb selected

The command 'pseudo=1' ensures that the pseudo atoms will be included in the written pdb file; 'atoms select "* - &DUMMY"' followed by 'write *.pdb selected' prevents the dummy atoms of the linker from being written to the pdb file.


Hint: Since you might have to do this a few times until the library is working and correct, it might be worthwhile to create an 'init.cya' and a 'CALC.cya' macro with the respective commands. This speeds things up and prevents the error output shown above.


Carefully analyze the WARNING and ERROR messages if any.

Then take a look at your 'test.pdb' in chimera and check that the chemistry and bonds are all as expected (ring closure!)

chimera test.pdb

Again overlay the 'test.pdb' with the provided 'demoLIG.pdb'.

If there are any issues, go "back to the drawing board" to fix them. Also check the pseudo atom names carefully, since they are used for the intermolecular NOEs later.

To help find problems, you may use the command:

write lib LIG.lib names

This will write the library file containing actual atom names rather than numbers.

Alternative Exercise 6-8: Converting a pdb file to a lib file for CYANA

In case you were unsuccessful with exercises 6-8 in terms of getting a working ligand library file, do not despair! There is an easy workaround that you may be able to use in a real case as well: converting a pdb file to a library file for CYANA.

Use Avogadro:

File --> Open

Open the 5c5aLig.pdb

Save as

LIG.mol2 (*.mol2)

Rename the Residue to LIG in the LIG.mol2 file.

 ./cylib-2.0/cylib -nc -sc LIG.mol2

Done!

You can run the tests outlined above, using anneal etc to test your library file.

Calculating the structure of the protein-ligand complex

Exercise 9: (Semi-automatic) Intermolecular cross peaks assignment and structure calculation

Since the molecular system contains protein and ligand, CYANA has to read the 'LIG.lib' file in addition to the regular 'cyana.lib' file. The sequence file needs to contain the protein and the ligand (and a linker to connect the two).

Copy the 'noebb' directory and give it the name 'noecc', then delete all previous, unnecessary output files to reduce clutter and keep a better overview.

cp -r noebb noecc
cd noecc
rm *cycle* *.out *.job final* rama*

Update the 'init.cya' file in order to read the ligand library file and the sequence file containing the linker and the ligand.

Add the command 'read lib LIG.lib append' after the 'cyanalib' command but before reading the sequence. The 'append' option is necessary; otherwise the library read from 'cyana.lib' will be overwritten by 'LIG.lib'.

rmsdrange:=15-111,333
cyanalib 

read lib LIG.lib append
read seq demoLong.seq
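Because the order of these commands matters, it can be handy to check an 'init.cya' file before starting a long calculation. The following Python sketch is only an illustration of the ordering rule stated above (ligand library read with 'append', after 'cyanalib', before the sequence). The function name is hypothetical and the check is deliberately simple; it does not parse the macro language in general.

```python
def check_init_macro(text, ligand_lib="LIG.lib"):
    """Return a list of problems with the command order in an init macro."""
    commands = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            commands.append(line.split())

    def position(*words):
        # index of the first command that starts with the given words
        for i, cmd in enumerate(commands):
            if tuple(cmd[:len(words)]) == words:
                return i
        return None

    i_lib = position("cyanalib")
    i_lig = position("read", "lib", ligand_lib)
    i_seq = position("read", "seq")

    if i_lig is None:
        return ["'read lib %s' is missing" % ligand_lib]
    problems = []
    if "append" not in commands[i_lig]:
        problems.append("'append' is missing from the 'read lib' command")
    if i_lib is None or i_lib > i_lig:
        problems.append("'cyanalib' must come before 'read lib'")
    if i_seq is not None and i_seq < i_lig:
        problems.append("the sequence is read before the ligand library")
    return problems
```

An empty list means that the three commands appear in the order described in this exercise.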

We assign the intermolecular cross peaks by supplying noeassign with an intermolecular xeasy peak list in which only the ligand resonances are assigned. The ligand resonances were assigned manually, based on an additional set of experiments (this is the semi-automatic part). The resonance assignment thereby matches the ligand atom names in the library file created in the previous exercise.

The protein side will then be assigned by noeassign.

Update your previous 'CALC.cya' macro by adding intermol-NOEs.peaks to the list of peak lists and adding the keep=all option to the noeassign command:

peaks:= cnoesy.peaks,nnoesy.peaks,aro.peaks,intermol-NOEs.peaks 	
prot:= flya.prot                   		
restraints:= talos.aco                    		
tolerance:= 0.040,0.030,0.45            		
structures := 100,20                      		
steps:= 10000                       		
randomseed:= 434726   
write_peaks_names=.true.
assign_noartifact:="** list=intermol-NOEs.peaks"
noeassign peaks=$peaks prot=$prot keep=all selectcombine="* - @LIG" autoaco

The command 'assign_noartifact' effectively disables the network anchoring tests for the ligand. Since the supplied list has been cleaned and is presumed to be free of artifacts, we are allowed to do this. We thereby encourage the use of the intermolecular NOEs even if the support by other nearby NOEs is weak. The command 'write_peaks_names=.true.' ensures that the assigned peak lists are written to file with the actual resonance names (this is not xeasy standard).

You can run the calculation again, commenting out (#) the 'assign_noartifact' command, and see the effect on the final structure.

The 'selectcombine' option changes how the tests for errors are done: intermolecular peaks do not have to compete with intra-protein peaks.

Run the calculation:

cyana -n 33 CALC.cya

Comparing the calculated NMR structure to an XRAY reference structure

Exercise 10: Compare the NMR structure to the Xray structure

Download (www.rcsb.org) the xray structure with ID: 5c5a

Use either a web-browser or the terminal:

wget 'https://files.rcsb.org/download/5c5a.pdb'

Using chimera, two structures can be compared by overlaying them and inspecting them visually.

When you have your xray structure ready, load your calculated nmr structure and the xray structure in chimera.

Use the chimera-specific commands to overlay the two structures and compare them visually.

Exercise 11: Preparing an xray structure to use within CYANA

Deposited structures often lack specific features. For example, Xray structures usually lack proton coordinates.

Copy your noecc results to a new directory called regulabb, then delete the previous output files that are no longer needed, to reduce clutter and keep a better overview.

cp -r noecc regulabb
cd regulabb
rm *cycle* *.out *.job

After reading the sequence file, the pdb file can be read with the option unknown=warn or unknown=skip. This skips the parts of the molecule that are not specified in the sequence file.

read pdb xxxx.pdb unknown=warn

Other options to read pdb's:

read 5c5a.pdb unknown=warn hetatm new

where the option 'hetatm' allows reading coordinates labeled HETATM rather than ATOM in the pdb file, and 'new' reads the sequence from the pdb file.

To write back out pdb's and sequences:

write pdb XXX.pdb
write seq XXX.seq  
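Before reading a deposited file into CYANA, it can help to see which residues are stored as HETATM records and how they are numbered. The following Python sketch is independent of CYANA and relies only on the fixed-column PDB format (record name in columns 1-6, residue name in columns 18-20, chain identifier in column 22, residue number in columns 23-26). The function name is hypothetical.

```python
from collections import Counter


def summarize_pdb(lines):
    """Count ATOM/HETATM records and list the residues stored as HETATM."""
    records = Counter()
    het_residues = set()
    for line in lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        records[record] += 1
        if record == "HETATM":
            res_name = line[17:20].strip()   # columns 18-20
            chain = line[21:22]              # column 22
            res_number = int(line[22:26])    # columns 23-26
            het_residues.add((res_name, chain, res_number))
    return records, sorted(het_residues)
```

Calling it as summarize_pdb(open("5c5a.pdb")) returns the record counts and a sorted list of (residue name, chain, residue number) tuples, which shows at a glance which ligand and solvent entries the file contains.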

Inspect the pdb using chimera. Besides HETATM, there are several issues that make a comparison with the calculated NMR structure within CYANA impossible until you fix them. You may use a graphical text editor to fix them. In the end, you need a conformer of the complex that is ready to be compared with the calculated NMR structure.

It would be best to practice the use of the 'regularize' command as well. This is, however, not really necessary in this particular case, since this xray structure contains proton coordinates. With the regularize command one can obtain a structure calculated within CYANA that has these features but is still very close to the input structure of your choice.

Copy your 'LIG.lib' file and name it 'NUT.lib'. In the 'NUT.lib' file, change the residue name from LIG to NUT. The 'NUT.lib' file is necessary to read the original xray structure with the ligand into CYANA.

Copy the 'demoLong.seq' file and name it 'demoLongEd.seq'. In the 'demoLongEd.seq' file, delete the linker residues.

Create an 'init.cya' macro with:

cyanalib
read lib LIG.lib append

Then create a 'CALC_reg.cya' macro with:

read lib NUT.lib append
read 5c5a.pdb unknown=warn hetatm new
write 5c5a_Ed.seq 
write 5c5a_Ed.pdb

#renumber the ligand from 201 to 333 and rename it from NUT to LIG
library rename "@NUT" residue=LIG
atoms select @LIG
atoms set residue=333

write 5c5a_renum.seq
write 5c5a_renum.pdb

#sequence with ligand but without linker
read demoLongEd.seq
read 5c5a_renum.pdb rigid unknown=warn

write XrayAChainRenum.pdb

initialize

read seq demoLong.seq
read pdb XrayAChainRenum.pdb unknown=warn

write pdb test.pdb

read pdb test.pdb
regularize steps=20000 link=LL keep

Execute the 'CALC_reg.cya' macro in the CYANA shell (or use only one processor, do not distribute the job):

cyana CALC_reg.cya

Exercise 12: Calculate the RMSD of NMR vs. xray structure using a CYANA macro

Using the INCLAN language of CYANA (see Writing and using INCLAN macros, Using INCLAN variables, Using INCLAN control statements), it is possible to write complex macros that interact with the FORTRAN code of CYANA, reading internal variables and manipulating them to achieve custom tasks.

  • Save the manually edited xray structure (exercise 11) or the regularized xray structure (containing the ligand and called 'regula.pdb') as 'reg_xray.pdb' to use the macro below (or change the name in the macro accordingly).
  • What do you think about the RMSD? Does the value make sense? Does the range make sense?

Below you find the commands for a macro (call it 'CALC_RMSD.cya') that reads the regularized xray structure and the calculated NMR structure and then calculates the RMSD of both the protein and ligand parts of the complex:

read demoLong.seq

rmsd range=15-111 structure=final.pdb reference=reg_xray.pdb

atom select "BACKBONE 15-111"
t=rmsdmean
j=rindex('333')
n=0
s=0.0
do i ifira(j) ifira(j+1)-1
  if (element(i).gt.1) then
    n=n+1
    s=s+displacement(i)
  end if
end do
print "RMSD of the LIG: ${s/n} ($n atoms)"

read pdb final.pdb
structure mean
write pdb mean.pdb

read pdb mean.pdb
read pdb reg_xray.pdb append

atom select "BACKBONE 15-111"
t=rmsdmean
atom select "WITHCOORDALL"
j=rindex('333')
n=0
s=0.0
do i ifira(j) ifira(j+1)-1
  if (element(i).gt.1.and.asel(i)) then
    n=n+1
    s=s+displacement(i)*2
  end if
end do
print "Displacement of the LIG (to ref xray): ${s/n} ($n atoms)"
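To help judge the numbers printed by the macro above, the following Python sketch works through the underlying arithmetic on plain coordinate lists. It assumes that the two coordinate sets are already superimposed and that a per-atom displacement is simply the distance between corresponding atoms; how CYANA defines 'displacement(i)' internally is not verified here. Note that averaging the displacements gives a mean displacement, whereas an RMSD is the square root of the mean squared displacement, so the two values generally differ.

```python
import math


def displacements(coords_a, coords_b):
    """Distance between corresponding atoms of two superimposed coordinate sets."""
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def mean_displacement(d):
    """Arithmetic mean of the per-atom displacements."""
    return sum(d) / len(d)


def rmsd(d):
    """Root mean square of the per-atom displacements."""
    return math.sqrt(sum(x * x for x in d) / len(d))


# Two hypothetical ligand atoms, displaced by 1 and 3 Angstrom, respectively
nmr = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
xray = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0)]
d = displacements(nmr, xray)
print("mean displacement:", mean_displacement(d))
print("RMSD:", round(rmsd(d), 3))
```

For this toy example the mean displacement is 2.0 and the RMSD is the square root of 5 (about 2.236); the RMSD is never smaller than the mean displacement.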

Beyond The Basics: Improving the final structure

FLYA options

There are a variety of commands to modify FLYA runs, for example to accommodate experimental labeling schemes or to apply previous assignments.

Modify the chemical shift statistics used for assignment

Supply user-defined chemical shift statistics that replace the standard BMRB statistics (CSTABLE) from the library 'cyana.lib':

  • average value and stddev from input chemical shift list 'shiftx.prot'
  • 'assigncs_sd:=bmrb' to use stddev from BMRB ('cyana.lib') instead of input chemical shift list
  • 'assigncs_sdfactor:=0.5' to scale BMRB stddev by given factor
shiftassign_statistics:=predicted.prot


Modify the reported statistics

Groups of atoms for which assignment statistics will be calculated and reported in the 'flya.txt' output file can be defined as:

analyzeassign_group := BB: N H CA CB C

In this case, the command defines a group called BB (a name that can be chosen freely) comprising the atoms N, H, CA, CB, C.


The optional parameter 'shiftreference=manREF.prot' specifies a reference chemical shift list that is used only for comparison in flya.tab, flya.txt, and flya.pdf:

 shiftassign_reference:=manREF.prot

The same parameter may also be set as part of the flya command:

flya runs=10 assignpeaks=$peaks shiftreference=manREF.prot


Modify the expected peak lists

Specific labeling can be handled and peak list-specific atom selections can be applied.

To restrict the generation of expected peaks to a subset of atoms, here the backbone atoms:

command select_atoms
  atom select "N H CA CB C"
end


Input structures may be used to generate expected peaks for through-space experiments:

  • specify with parameter 'structure' of the command 'flya'
  • if parameter 'structure' is absent, a set of random structures is generated automatically
  • if set to blank ('structure='), no random structures are generated (if not needed because only through-bond spectra are used)
flya runs=10 assignpeaks=$peaks structure=XXX.pdb

Experimental peaks may also be employed as expected peak lists:

  • command N15NOESY_expect, reading input peak list N15NOESY_in.peaks
N15NOESY_expect :=N15NOESY_in


Keeping previously determined assignments

To keep input peak assignments in user peak assignments:

  • (partially) assigned input peak list XXX.peaks
  • parameter 'keepassigned' for 'loadspectra.cya'
loadspectra_keepassigned:=.true.

To fix input chemical shift assignments contained in a prot file

For example, to do this for the backbone atoms extracted from the manREF.prot list:

Make a list of only the reference backbone chemical shifts by entering the CYANA commands:

read manREF.prot
  atom set "* - H N CA CB C" shift=none
write fix.prot

The file 'fix.prot' will contain the reference chemical shifts only for the backbone (and CB) atoms H, N, CA, CB, C'. Now you can repeat the assignment calculation by inserting the 'shiftassign_fix:=fix.prot' statement in 'CALC.cya' and choosing only the input peak lists that are relevant for sidechain assignment:

shiftassign_fix:=fix.prot
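To check what ended up in 'fix.prot', or to prepare a comparable list outside CYANA, the same selection can be sketched in Python. This is only an illustration: it assumes the common five-column chemical shift list layout (atom number, shift, error, atom name, residue number), keeps comment lines starting with '#', and simply drops all other atoms instead of marking their shifts as unassigned. The function name is hypothetical.

```python
BACKBONE = {"H", "N", "CA", "CB", "C"}


def keep_backbone_shifts(prot_lines, keep=BACKBONE):
    """Keep only the shift entries whose atom name is in `keep` (assumed 5-column layout)."""
    kept = []
    for line in prot_lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            kept.append(line)  # keep blank lines and comments unchanged
            continue
        # number, shift, error, atom name, residue number
        if len(fields) >= 5 and fields[3] in keep:
            kept.append(line)
    return kept
```

Applied to the lines of 'manREF.prot', the function returns only the entries for H, N, CA, CB and C, which can then be compared with the 'fix.prot' file written by CYANA.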


Chemical shift assignment using exclusively NOESY

  • increased population size with 'shiftassign_population=200'
  • see Schmidt et al. J. Biomol. NMR 57, 193-204 (2013)


Speeding up FLYA runs

The options below serve fast automated chemical shift assignment. The results are in general less accurate, since either the populations are smaller, there are fewer parallel runs, or the optimization schedule is modified.

In production runs, better results can be expected (at the expense of longer computation times) if these parameters are not set.


There are three parameters of the assignment algorithm that can be set in order to speed up the calculation.

Fixed number of generations in evolutionary optimization:

shiftassign_population=25

The population size for the genetic algorithm, i.e. how many assignments form one generation (25; chosen smaller than in normal production runs in order to speed up the calculation).

There is also an option to choose the "quick" optimization schedule:

shiftassign_quick=.true.

Finally, the 'runs' option can be set for flya, as we did in the exercise ('flya runs=10').


noeassign options

To learn more about noeassign, consult the tutorial Structure calculation with automated NOESY assignment. Other options for noeassign are described here: CYANA_Macro:_noeassign

Exercise 13: Mapping restraints onto a known structure

One can map the calculated restraints, such as distance restraints (upl/lol) onto a known structure (in the example here an xray structure). This is another approach to analyze restraints and their influence on the results.

Below you find the commands to accomplish this. By studying the commands you can see which files are needed to execute the macro. Therefore, create a new directory ('mkdir') or copy a directory containing the respective files. Delete what you do not need. Use the regularized xray structure from exercise 11.

Commands preceded by hashtags (#) are commented out; remove the hashtags if you want to use them. If you decide to use the intermol-NOEs-cycle7.peaks file, make sure to comment out any commands you no longer need.

You need an init file:

rmsdrange:=15-111,333
cyanalib
read lib LIG.lib append

And the main macro (name it 'CALC_xraymap.cya'):

read seq demoLong.seq

The following block of commands takes the assigned intermolecular peak list and calculates distance restraints from the peak intensities:

#peaks:=intermol-NOEs-cycle7.peaks
#calibration peaks=$peaks
#peaks calibrate simple
#write upl intermol.upl

The following block of commands reads the 'final.upl' list (in this case from noeassign), selects the intermolecular NOEs to LIG, and writes them to file:

read upl final.upl
distance select "*, @LIG" info=full
write intermol.upl
read intermol.upl unknown=warn

#read upl lig.upl append
#read lol lig.lol

read regula.pdb unknown=warn

weight_vdw=0
overview intermol_xray.ovw
  • If the restraints do not match with the xray structure, does it mean they are wrong?
  • If you tried the two options, what is (are) the difference(s)?
  • Did you look at the LIG.upl/lol files in the demo_data folder? What are they? What types of NMR experiments can be used to obtain them?
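To see which restraints the selection in the macro above is meant to capture, the same idea can be sketched outside CYANA. The Python function below assumes the usual layout of a upl file (residue number, residue name and atom name for each of the two atoms, followed by the distance limit) and keeps the restraints in which exactly one partner belongs to the residue LIG. Whether the CYANA selection also retains restraints within the ligand is not verified here; the function name and the ligand atom names in the example are hypothetical.

```python
def select_intermolecular(upl_lines, ligand="LIG"):
    """Keep restraints between the ligand residue and any other residue."""
    selected = []
    for line in upl_lines:
        fields = line.split()
        if len(fields) < 7 or fields[0].startswith("#"):
            continue  # skip comments, blank lines and incomplete records
        res_name_1, res_name_2 = fields[1], fields[4]
        # exactly one of the two atoms must belong to the ligand
        if (res_name_1 == ligand) != (res_name_2 == ligand):
            selected.append(line)
    return selected


example = [
    "  15 ALA  HA     18 LEU  HB2     4.50",
    "  54 LEU  QD1   333 LIG  H12     5.00",
    " 333 LIG  H1    333 LIG  H2      3.00",
]
for restraint in select_intermolecular(example):
    print(restraint)
```

Only the second line of the example is printed, since it is the only restraint that connects a protein atom with a ligand atom.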

Exercise 14: Work on improving the final structure

Using what you have learned so far, employing some of the options of FLYA and noeassign, consider if it is possible to improve the resolution of the final structure.


General questions to answer regarding this task:

  • Name additional experimental restraints (or inputs) you could use for structure calculation.
  • Name additional NMR experiments you could measure, to acquire experimental data that are not supplied with the demo_data.