Identification of key NOEs


This tutorial provides guided examples for identifying key NOEs in a two-state eNOE calculation. Keep in mind that the approach described here can be generalized to any other type of calculation.

In summary, our approach consists of the following steps:

  1. We extract all long range NOEs
  2. For each long range NOE we run a two-state structure calculation missing this particular NOE
  3. We evaluate structure calculations in terms of target function and correlations (using PDBcor)
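
The leave-one-out idea behind steps 1 and 2 can be sketched in Python as follows. This is only an illustration, not the code used by RUN.cya. It assumes that each restraint line has the column layout "residue1 name1 atom1 residue2 name2 atom2 distance" and that "long range" means a sequence separation of at least 5 residues; both are assumptions, and the function names are hypothetical. In the tutorial the selected NOE is removed from both the upl and the lol file, whereas the sketch handles a single list of lines.

```python
from typing import Iterator, List, Tuple

# Assumption: a restraint counts as "long range" if |i - j| >= 5.
MIN_SEPARATION = 5


def is_long_range(line: str, min_sep: int = MIN_SEPARATION) -> bool:
    """Return True if a restraint line links residues at least min_sep apart."""
    fields = line.split()
    # Assumed columns: res1 name1 atom1 res2 name2 atom2 distance
    if len(fields) < 7 or line.lstrip().startswith("#"):
        return False
    try:
        first, second = int(fields[0]), int(fields[3])
    except ValueError:
        return False
    return abs(first - second) >= min_sep


def leave_one_out(lines: List[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (omitted_line, remaining_lines) for every long-range restraint."""
    for index, line in enumerate(lines):
        if is_long_range(line):
            yield line, lines[:index] + lines[index + 1:]


if __name__ == "__main__":
    # Made-up restraint lines, for illustration only.
    demo = [
        "  3 ALA  HA    4 GLY  H      3.20",
        "  3 ALA  HB   15 LEU  HD1    4.50",
        " 10 VAL  HA   42 PHE  HZ     5.00",
    ]
    for omitted, remaining in leave_one_out(demo):
        print("omit:", omitted.strip(), "| keep", len(remaining), "restraints")
```

Each yielded pair corresponds to one structure calculation: the omitted restraint and the restraint list that would be used as input for that calculation.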

Software installation

This tutorial requires the following software:

  1. CYANA
  2. Python3
  3. UCSF Chimera
  4. PDBcor. If PDBcor is not yet installed, please go to this link and follow the installation instructions. Below, the PDBcor installation path is referred to as PATH/To/PDBcor

Data preparation

Follow these steps:

  1. Download the demo data.
  2. Unpack the demo data

Execution

We recommend running the calculations in parallel, as this significantly reduces the total running time.

Follow the steps below carefully (exact Linux commands are given; you may copy them to a terminal):

First, edit the protein-specific data in the folder data/. Make sure that the filenames stay the same as in the demo. Advanced users can use other names and adapt PREP.cya accordingly (keep in mind that RUN.cya writes the modified distance restraints to filtered.upl and filtered.lol after removing the selected long-range NOE). Replace the protein sequence file in the root folder (it does not need to keep the same filename).

Additionally, perform the following checks:

Check that the rmsdrange variable (the range used to calculate the RMSD of the protein bundle) is set to the correct value in init.cya and CALC.cya.

Check that the nres variable (number of residues) is set to the correct value in SPLIT.cya.

Then, test the data/ folder by running a test calculation (for this test only, the unmodified restraint files are used as the filtered input files):

cp -r data test
cd test
mv demo.upl filtered.upl
mv demo.lol filtered.lol
cyana PREP.cya
cyana -n 20 CALC.cya #cyana CALC.cya if no parallel computing is available

Check that both PREP and CALC scripts run without errors.

Check in Chimera that the resulting structure bundle.pdb is correct.

Check that the split structure splitall.pdb is split correctly (each conformer 1-nres is a separate PDB model).
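
As a quick complement to the visual check, the number of models in splitall.pdb can be counted with a few lines of Python. This is a minimal sketch that relies only on the standard PDB convention that every model starts with a MODEL record; the function name is hypothetical, and the number of models to expect depends on your calculation settings.

```python
from pathlib import Path


def count_models(pdb_text: str) -> int:
    """Count the MODEL records in the text of a PDB file."""
    return sum(1 for line in pdb_text.splitlines() if line.startswith("MODEL"))


if __name__ == "__main__":
    path = Path("splitall.pdb")  # run inside the test/ folder
    if path.exists():
        print(path.name, "contains", count_models(path.read_text()), "models")
    else:
        print(path.name, "not found in the current folder")
```

If the printed number differs from what you expect for your bundle, recheck the nres setting in SPLIT.cya.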

Then, edit RUN.cya:

Edit upl and lol filenames

read upl data/demo.upl
read lol data/demo.lol

Optionally, edit the CYANA executable (change this only if you are sure what it does)

cyana:=cyana

If you do not have access to parallel computation, change the following line of RUN.cya from

system "pwd; ls -l CALC.cya; $cyana -n 20 CALC inputseed=$seed"

to:

system "pwd; ls -l CALC.cya; $cyana CALC inputseed=$seed"

Then, run the series of two-state structure calculations:

cyana RUN.cya

This creates a folder for each two-state structure calculation. In each calculation folder, the omitted NOE restraint is saved in the file omitedrestraint.txt, and the new restraint files are saved as filtered.upl and filtered.lol. The final structure is saved as bundle.pdb, and the final split structure (where each state is an independent model) is saved as splitall.pdb.
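
Before starting the correlation analysis, it can help to confirm that every calculation produced all of the files listed above. The following Python sketch does this under one assumption: any subfolder of the working directory that contains omitedrestraint.txt is treated as a calculation folder. The function name is hypothetical, and the folder naming used by RUN.cya is not assumed.

```python
from pathlib import Path
from typing import Dict, List

# Marker file used to recognize a calculation folder (assumption).
MARKER = "omitedrestraint.txt"
# Output files named in the tutorial text.
EXPECTED = ["filtered.upl", "filtered.lol", "bundle.pdb", "splitall.pdb"]


def missing_outputs(root: Path) -> Dict[str, List[str]]:
    """Map each incomplete calculation folder to the files it lacks."""
    report: Dict[str, List[str]] = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        if not (folder / MARKER).exists():
            continue  # not a calculation folder, e.g. data/
        missing = [name for name in EXPECTED if not (folder / name).exists()]
        if missing:
            report[folder.name] = missing
    return report


if __name__ == "__main__":
    problems = missing_outputs(Path("."))
    if not problems:
        print("All calculation folders are complete.")
    for name, files in problems.items():
        print(name, "is missing:", ", ".join(files))
```

A folder that lacks bundle.pdb or splitall.pdb most likely points to a calculation that failed or has not finished yet.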

If you do not have access to parallel computation, change the following line of ANALYSE.cya from

system "bsub python PATH/To/PDBcor/correlationExtraction.py $dir/splitall.pdb --therm_iter=5"

to:

system "python PATH/To/PDBcor/correlationExtraction.py $dir/splitall.pdb --therm_iter=5"

Then, activate the PDBcor environment and run the correlation analysis:

source PATH/To/PDBcor/venv/bin/activate
cyana ANALYSE.cya

This creates a correlations/ subfolder in each structure calculation folder with a correlation value that can be read from the correlations_backbone.txt file.

Finally, collect results in a single file:

cyana STAT_COLLECT.cya

This collects the correlation and target function values of all executed calculations into the output file results.txt, using the bash script extract_cor_value.sh, and then sorts them by target function (results_tf.txt) and by correlation (results_cor.txt).
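
The sorting step can be sketched in Python as follows. The layout of results.txt is not specified here, so the sketch assumes a hypothetical format with three whitespace-separated columns per line: calculation name, target function, and correlation. The ascending sort order and all names in the code are likewise assumptions, and the demo values are made up.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    target_function: float
    correlation: float


def parse_results(text: str) -> List[Result]:
    """Parse lines of the assumed form 'name target_function correlation'."""
    rows: List[Result] = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        try:
            rows.append(Result(fields[0], float(fields[1]), float(fields[2])))
        except ValueError:
            continue  # skip lines with non-numeric values, e.g. a header
    return rows


def sort_by(rows: List[Result], column: str) -> List[Result]:
    """Return the rows sorted in ascending order of the given column."""
    return sorted(rows, key=lambda row: getattr(row, column))


if __name__ == "__main__":
    # Made-up values in the assumed format, for illustration only.
    demo = "calc_a 1.50 0.30\ncalc_b 0.90 0.45\ncalc_c 2.10 0.10\n"
    rows = parse_results(demo)
    print([row.name for row in sort_by(rows, "target_function")])
    print([row.name for row in sort_by(rows, "correlation")])
```

Calculations whose target function or correlation differs most from the rest are the natural candidates to inspect first when looking for key NOEs.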