Determination of the protein state populations
Latest revision as of 12:42, 7 September 2021
In this tutorial we will provide you with guided examples for determination of protein state populations.
In summary, our approach consists of the following steps:
- We conduct a series of 10-state structure calculations with a varying population parameter.
- We evaluate the structure calculations in terms of target function and correlations (using PDBcor).
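To make the idea of a varying population parameter concrete, the following minimal sketch lists every way of dividing 10 states between two populations A and B, written in the `populations=A,B` form used later in this tutorial. Which of these splits RUN.cya actually scans is not stated here, so the full range is an assumption, and the helper names are hypothetical.

```python
def population_splits(n_states=10):
    """Return every (A, B) split of n_states between two populations."""
    return [(a, n_states - a) for a in range(n_states + 1)]


def as_cyana_argument(split):
    """Format a split in the populations=A,B form used for PREPpop.cya."""
    a, b = split
    return f"populations={a},{b}"


# Assumption: all integer splits are of interest, including the two cases
# in which only one of the populations is populated.
for split in population_splits():
    print(as_cyana_argument(split))
```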
Software installation
This tutorial requires the following software:
- CYANA
- Python3
- UCSF Chimera
- PDBcor. If PDBcor is not yet installed, please go to this link and follow the installation instructions. Below, we refer to the PDBcor installation path as PATH/To/PDBcor.
Data preparation
Please follow these steps:
- Download the demo data.
- Unpack the demo data.
Execution
We recommend using parallel computation for the execution, as it significantly reduces the total running time.
Please follow these steps carefully (exact Linux commands are given below; you may copy them to a terminal):
First, edit the protein-specific data in the folder data/. Make sure that the filenames are kept the same as in the demo. Advanced users can ignore this rule and instead adjust PREPpop.cya to match their own filenames. Change the protein sequence in the root folder (the same filename is not required).
Additionally, perform the following checks:
Check that init.cya and CALC.cya have the rmsdrange variable (the residue range used to calculate the RMSD of the protein bundle) set to the correct value.
Check that SPLITpop.cya has the nres variable (number of residues) set to the correct value. Keep in mind that in this case SPLITpop takes only one sample each from population A and population B. This means that SPLITpop outputs a PDB file with 40 conformers in 40 separate PDB models if both states A and B are populated, or 20 conformers in 20 separate PDB models otherwise.
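The expected number of models can be written down as a small calculation, which is useful when checking splitall.pdb later. This is only a sketch: the function name is hypothetical, and the bundle size of 20 conformers is an assumption inferred from the 40 and 20 figures quoted above.

```python
def expected_split_models(pop_a, pop_b, bundle_size=20):
    """Expected number of PDB models written by SPLITpop.

    One sample is taken per populated state for every conformer of the
    bundle, so the result is bundle_size times the number of populated states.
    bundle_size=20 is an assumption inferred from the tutorial text.
    """
    populated_states = sum(1 for population in (pop_a, pop_b) if population > 0)
    if populated_states == 0:
        raise ValueError("at least one population must be greater than zero")
    return bundle_size * populated_states


print(expected_split_models(5, 5))   # both states populated
print(expected_split_models(10, 0))  # only state A populated
```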
Then, test the data/ folder by running a test calculation:
cp -r data test
cd test
cyana PREPpop.cya populations=5,5
cyana -n 20 CALC.cya   # use "cyana CALC.cya" if no parallel computing is available
Check that both the PREP and CALC scripts run without errors.
Check in UCSF Chimera that the resulting structure bundle.pdb is correct.
Check that the split structure splitall.pdb is split correctly (each conformer, with residues 1 to nres, is a separate PDB model).
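In addition to the visual inspection, the split can be checked with a short script. The sketch below relies only on the standard PDB record layout (MODEL and ENDMDL records, residue number in columns 23 to 26 of ATOM records) and reports the highest residue number found in each model. The function name is hypothetical, and it is an assumption that splitall.pdb follows this layout exactly.

```python
def max_residue_per_model(pdb_path):
    """Return the highest residue number found in each MODEL of a PDB file."""
    maxima = []
    current = None  # residue numbers of the model being read, or None outside a model
    with open(pdb_path) as handle:
        for line in handle:
            record = line[:6].strip()
            if record == "MODEL":
                if current is not None:  # previous model lacked an ENDMDL record
                    maxima.append(max(current, default=0))
                current = set()
            elif record in ("ATOM", "HETATM") and current is not None:
                current.add(int(line[22:26]))  # PDB columns 23-26: residue number
            elif record == "ENDMDL" and current is not None:
                maxima.append(max(current, default=0))
                current = None
    if current is not None:  # file ended without a closing ENDMDL
        maxima.append(max(current, default=0))
    return maxima


# Example use (nres is the value set in SPLITpop.cya):
# maxima = max_residue_per_model("splitall.pdb")
# print(len(maxima), "models;", "all within nres:", all(m <= nres for m in maxima))
```

The number of entries in the returned list is the number of models, which for the test calculation with populations=5,5 should match the 40 models described above.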
Then, edit RUN.cya:
Edit the cyana engine (do this only if you are sure what it is):
cyana:=cyana
If you do not have access to parallel computation, change the following line of RUN.cya from
system "pwd; ls -l CALC.cya; $cyana -n 20 CALC inputseed=$seed"
to:
system "pwd; ls -l CALC.cya; $cyana CALC inputseed=$seed"
Then, run the series of 10-state structure calculations:
cyana RUN.cya
This creates a folder for each 10-state structure calculation. The final structure is saved as bundle.pdb, and the final split structure (where each state is an independent model) is saved as splitall.pdb.
If you do not have access to parallel computation, change the following line of ANALYSE.cya from
system "bsub python PATH/To/PDBcor/correlationExtraction.py $dir/splitall.pdb --therm_iter=5"
to:
system "python PATH/To/PDBcor/correlationExtraction.py $dir/splitall.pdb --therm_iter=5"
Then, activate the PDBcor environment and run the correlation analysis:
source PATH/To/PDBcor/venv/bin/activate
cyana ANALYSE.cya
This creates a correlations/ subfolder in each structure calculation folder with a correlation value that can be read from the correlations_backbone.txt file.
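Because the analysis may be submitted as separate jobs, it can help to confirm that every calculation folder has received its correlation output before collecting the results. The following sketch lists folders that contain splitall.pdb but no correlations/correlations_backbone.txt. It assumes that the calculation folders sit directly below the working directory; the function name is hypothetical.

```python
from pathlib import Path


def missing_correlation_outputs(root="."):
    """List calculation folders that have splitall.pdb but no correlation file."""
    missing = []
    # Assumption: each calculation folder is a direct child of root.
    for split_file in sorted(Path(root).glob("*/splitall.pdb")):
        calc_dir = split_file.parent
        output = calc_dir / "correlations" / "correlations_backbone.txt"
        if not output.is_file():
            missing.append(calc_dir.name)
    return missing


# Example use from the folder in which RUN.cya was executed:
# print(missing_correlation_outputs("."))
```

Note that a test/ folder created earlier also contains splitall.pdb and will be listed if the analysis was not run on it.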
Finally, collect results in a single file:
cyana STAT_COLLECT.cya
This uses the bash script extract_cor_value.sh to collect the correlation and target function values of all executed calculations into a single output file, results.txt.