Assign: Difference between revisions

From CYANA Wiki
m (1 revision)
 
== Description ==


The '''assign''' command performs automated assignment of the NOESY
cross peaks on the basis of the given chemical shifts, knowledge of
covalently constrained short distances, and the selected 3D conformers,
if available. The '''assign''' command is used in the '''noeassign''' macro
to implement a combined automated NOESY assignment and structure
calculation strategy.


Input data:

Required input data consists of unassigned (or assigned) NOESY
peaks from one or several peak lists, and one or several chemical
shift lists. Optional input data comprises a group of selected
conformers and a list of covalently constrained short distances. An
upper distance bound must have been attributed to each input peak,
for instance using the '''peaks simplecal''' command or the '''calibration'''
macro, which convert peak intensities or volumes into distance bounds.


Output data:

Output data comprises assignments made by the '''assign''' command for
the peaks that were NOT selected in the input peak lists, as well as a
report including details on the assignment of each individual peak and
a summary table. Peaks that were selected on input are not modified. If
peaks are assigned and unselected on input, the report also provides
a comparison between the input assignment and the new assignment made
by the '''assign''' command, which overwrites the input assignment.

Assignment strategy:

First, all assignment possibilities of a peak are generated on the
basis of the chemical shift values that match the peak position
within the tolerance defined by the '''tolerance''' variable. Second,
the probability for agreement with the bundle of selected conformers,
if present, is computed as the fraction of the conformers in which the
corresponding distance is shorter than the upper distance bound plus
the acceptable ''violation''. Assignment possibilities for which the
product of these two probabilities (chemical shift matching and
agreement with the conformers) is below the required ''probability''
threshold are discarded. Third, each remaining assignment possibility
is evaluated for its network anchoring, i.e., its embedding in the
network formed by the assignment possibilities of all the other peaks
and the covalently constrained distances. The network anchoring
probability that the distance corresponding to an assignment is
shorter than the upper distance bound plus the acceptable ''violation''
is computed given the assignments of the other peaks but independently
of any knowledge of the three-dimensional structure. Only assignment
possibilities for which the product of the three probabilities is
above the required ''probability'' threshold are accepted. Next, the
overall quality Q of the assignment of a peak is computed from the
probabilities of its individual accepted assignment possibilities. The
overall quality of a peak assignment is always at least as large as
the highest probability of an accepted assignment possibility. Peaks
are kept assigned only if their quality exceeds the ''quality'' cutoff.
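
The two filtering stages described above can be illustrated with a minimal Python sketch. This is not CYANA code: the function names are hypothetical, the chemical shift and network anchoring probabilities are taken as given inputs because their formulas are not stated here, and returning 1.0 when no conformers are available is an assumption made only for this illustration.

```python
def structure_probability(distances, upper_bound, violation):
    """Fraction of conformers in which the distance is shorter than
    upper_bound + violation (all values in Angstrom)."""
    if not distances:
        # Assumption for this sketch: without conformers, no structural filter.
        return 1.0
    limit = upper_bound + violation
    return sum(d < limit for d in distances) / len(distances)


def is_accepted(p_shift, p_struct, p_network, probability=0.2):
    """Apply the two thresholds from the text to one assignment possibility."""
    # Stage 2: discard if the product of the first two probabilities is too low.
    if p_shift * p_struct < probability:
        return False
    # Stage 3: accept only if the product of all three is above the threshold.
    return p_shift * p_struct * p_network > probability


# Hypothetical distances of one atom pair in four conformers.
p_struct = structure_probability([2.5, 2.9, 3.4, 6.0], upper_bound=3.08, violation=0.5)
print(p_struct, is_accepted(0.9, p_struct, 0.8))
```

The overall quality Q is deliberately left out, since the text only states its lower bound and not how it is computed.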
 
Example assignment report for a peak:
 
  Peak 165 from c13.peaks (8.72, 4.11, 59.86 ppm; 3.08 A):
  2 out of 4 assignments used, quality = 0.97:
  * H    ILE  64 + HA    ILE  63  OK    90    99 100  91  2.1-2.3  1260=69, 63/50=24...(10)
    H    ILE  63 + HA    ILE  63  OK    71    71 100 100  2.8-2.8  3.0=100
    H    SER  43 - HA    ILE  63  far    0    95  0  -  6.4-9.0
    H    ALA  22 - HA    ILE  63  far    0    99  0  -  9.9-14.6
  Violated in 0 structures by 0.00 A.
 
* Line 1: Peak number, peak list, peak position, upper distance bound.
* Line 2: Number of used assignments, number of assignment possibilities,
  overall quality of the peak assignment (0..1). Quality values below
  the ''quality'' cutoff are marked as "low quality", and the peak remains
  unassigned.
* Lines 3-7: Individual assignment possibilities


Revision as of 18:02, 28 January 2009

Parameters

alignfactor=real (default: 0.5)
matchfactor=real (default: 0.5)
violation=real (default: -1.0)
probability=real (default: 0.2)
quality=real (default: 0.5)
elasticity=real range (default: 1.0..1.0)
confidence=real (default: 1.0)
supportweight=real (default: 1.0)
prefer=integer (default: 999999)
interrange=integer range (default: 0..)
unassigned=real (default: 0.1)
short
changevol

Description

   For reasons of space, only the first few contributions are printed.
   An ellipsis "..." followed by the total number of contributions
   in parentheses indicates that not all contributions with probability
   greater than 1% are printed.
  • Line 8 (last line): Number of conformers in which the upper distance
    limit of the ambiguous distance restraint formed by the accepted
    assignments (marked by + in lines 3-7) is violated by more than
    the violation threshold, and the average size of the violation.

Covalently constrained distances:

The covalently constrained short distances are normally taken from distance restraints with weight zero, which can be obtained, for instance, by analyzing a bundle of randomized conformers with the distance short command, as implemented in the noeassign macro. If no distance restraints with weight zero exist, the short distances are calculated internally from the selected conformers (which should be randomized), if available and if violation is negative, or by an analytical calculation otherwise.

Elasticity of upper distance bounds:

When searching for peak assignments the algorithm can adapt individual upper distance bounds in the input peak lists by a factor within the allowed elasticity range. An individual upper bound can be increased if a slight violation of the original upper distance bound can be avoided by the increased distance limit in at least 80% of the conformers. An individual upper bound can be decreased if the actual distances in the input conformers are consistently shorter than the upper distance bound. By default, there is no "elasticity" of the upper distance bounds, i.e. the input distance limits are used without change. If an upper distance is changed, its modified value is indicated in the first line of the report on the assignment of the peak. The additional option changevol can be used to correct peak volumes according to the internal change of the corresponding upper distance bound using an inverse sixth power relationship.
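
As an illustration of the elasticity range and the changevol correction, the following Python sketch limits a proposed upper bound to the allowed factor range and rescales a peak volume with an inverse sixth power law. The function names and the idea of a "proposed" bound are hypothetical. How CYANA chooses the new bound is not described here, and the volume formula is one reading of "inverse sixth power relationship" (volume proportional to d^-6).

```python
def apply_elasticity(upper_bound, proposed_bound, elasticity=(1.0, 1.0)):
    """Limit the change of an upper bound to a factor within the elasticity range."""
    low, high = elasticity
    factor = proposed_bound / upper_bound
    factor = min(max(factor, low), high)  # clamp the factor to [low, high]
    return upper_bound * factor


def corrected_volume(volume, old_bound, new_bound):
    """Rescale a peak volume, assuming volume is proportional to bound**-6."""
    return volume * (old_bound / new_bound) ** 6


# With the default range 1.0..1.0 the input bound is used without change.
print(apply_elasticity(4.0, 5.0))
# With a hypothetical range 0.8..1.2 the bound can grow by at most 20%.
print(apply_elasticity(4.0, 5.0, elasticity=(0.8, 1.2)))
```

Under this reading, a larger upper bound corresponds to a smaller corrected volume.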

Additional control parameters:

The probability for the chemical shift matching is calculated using the tolerance values multiplied by matchfactor. A smaller matchfactor implies a higher weight for good agreement between the peak coordinates and the chemical shifts. The mutual alignment of peaks is controlled by the variable tolerance, and the probability for network anchoring is calculated using the tolerance values multiplied by alignfactor. A smaller alignfactor implies a higher weight for good mutual alignment between peaks with assignment possibilities to the same atom(s). When calculating the network anchoring probability of a given peak assignment, the probabilities of other aligned peaks may be scaled by a confidence factor between 0 and 1. Chemical shift assignments with an attached chemical shift error larger than the unassigned cutoff are treated as "unassigned" when determining the initial assignment possibilities of peaks: Only one of the two atoms of an assignment may be "unassigned", and, if in addition the short option is set, only short-range assignments for covalently constrained distances are considered.
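
The rule for "unassigned" chemical shifts can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes one particular reading of the text: that the short option restricts only those assignment possibilities that involve an "unassigned" atom.

```python
def is_unassigned(shift_error, unassigned=0.1):
    """A shift counts as "unassigned" if its error exceeds the cutoff."""
    return shift_error > unassigned


def possibility_allowed(error1, error2, covalently_constrained,
                        short=False, unassigned=0.1):
    """Decide whether an initial assignment possibility between two atoms is
    considered, given the chemical shift errors of both atoms."""
    n_unassigned = (is_unassigned(error1, unassigned)
                    + is_unassigned(error2, unassigned))
    if n_unassigned == 2:
        return False  # only one of the two atoms may be "unassigned"
    if n_unassigned == 1 and short:
        # Assumed reading: with short set, only covalently constrained
        # short-range possibilities are kept for "unassigned" atoms.
        return covalently_constrained
    return True


print(possibility_allowed(0.02, 0.5, covalently_constrained=False, short=True))
```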

Symmetric homodimers:

The assign command provides special features for symmetric homodimers that can be defined with the molecules define command. In the case of a homodimer, only assignments with the first atom in the first monomer are made. The corresponding symmetric distance restraint can be added afterwards with the molecules symmetrize command. Homodimer assignments are restricted to be only intramolecular or only intermolecular for peaks with (XEASY) color codes 8 or 9, respectively. Furthermore, intermolecular homodimer assignments between residues i and j are considered only if |i-j| is within the interrange. Intermolecular assignments of a peak are also excluded if the peak has at least one intramolecular assignment between residues i and j with |i-j| smaller than prefer.
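
The homodimer rules above can be summarized in a short Python sketch. The function names and the representation of a peak's intramolecular assignments as a list of residue separations are hypothetical. Only the color codes, interrange and prefer come from the text.

```python
def intramolecular_allowed(color_code):
    """Peaks with color code 9 are restricted to intermolecular assignments."""
    return color_code != 9


def intermolecular_allowed(i, j, color_code, intra_separations,
                           interrange=(0, None), prefer=999999):
    """Decide whether an intermolecular assignment between residues i and j is
    considered. intra_separations lists |i-j| for the intramolecular
    assignments of the same peak; None as upper limit means unbounded."""
    if color_code == 8:
        return False  # color code 8: intramolecular assignments only
    separation = abs(i - j)
    low, high = interrange
    if separation < low or (high is not None and separation > high):
        return False  # |i-j| lies outside the interrange
    if any(s < prefer for s in intra_separations):
        return False  # an intramolecular assignment below prefer takes precedence
    return True


# With the default prefer, any intramolecular assignment excludes intermolecular ones.
print(intermolecular_allowed(10, 40, 0, intra_separations=[3]))
```

Lowering prefer in this sketch lets intermolecular assignments coexist with intramolecular ones whose residue separation is at least prefer.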

Further reading:

  • Herrmann et al. J. Mol. Biol. 319, 209-227 (2002).
 (Note that the algorithm implemented in the assign command differs
 significantly from the original CANDID algorithm described in this
 publication.)
  • Guntert. Meth. Mol. Biol. 278, 353-378 (2004).
  • Guntert. Prog. NMR Spectrosc. 43, 105-125 (2003).
  • Jee & Guntert. J. Struct. Funct. Genom. 4, 179-189 (2003).

See also