UbnmrConsensusDetails

From NESG Wiki

Details of UBNMR consRun command

NOTE: The UBNMR consRun command assumes that the peaks in the three peaklists it uses (the original xeasy peaklist, the new cyana assigned peaklist, and the autostructure/sparky assigned peaklist) are in exactly the same order. This holds as long as the same original peaklist is used as input to both the cyana21 and autostructure runs. The consRun command also assumes that the cyana peaks are in ascending order: peaks may be missing, but no peak may be out of numerical order.
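The ordering assumption can be checked before running consRun. The following is a minimal Python sketch, not part of UBNMR: the function name `first_out_of_order` is hypothetical, and it assumes the peak indices have already been extracted as integers from the first column of the cyana peaklist. Treating a repeated index as out of order is also an assumption.

```python
def first_out_of_order(peak_indices):
    """Return the first peak index that breaks ascending order, or None.

    Gaps (missing peaks) are allowed; an index that is less than or
    equal to its predecessor is reported (assumption: repeats count as
    out of order).
    """
    previous = None
    for index in peak_indices:
        if previous is not None and index <= previous:
            return index
        previous = index
    return None


if __name__ == "__main__":
    print(first_out_of_order([1, 2, 5, 9]))   # gaps are fine
    print(first_out_of_order([1, 3, 2, 4]))   # 2 follows 3
```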

The sequence list and atomlist (protlist file) must be read into UBNMR before attempting the consRun command.


Algorithm

  • All three peaklists are read into peaklist objects. The cyana peaklist object differs somewhat from the sparky peaklist object: peaks in the cyana peaklist object are indexed by peakIndex, the first number on each line of the cyana21 peaklist, whereas peaks in the sparky peaklist object are indexed by consecutive line number.
  • Multiple assignments are parsed and stored as "additional assignments" in each peaklist object. A variable numAss is maintained for each peak.
  • The program steps through the peaks of the cyana assignment peaklist and the sparky assignment peaklist together, in consecutive order.
  • For each peak, the chemical shifts are compared between sparky and cyana21. If the chemical shifts for the peak differ between cyana and autostructure, the command is aborted with a FATAL ERROR message. This indicates either that the peaklists for the two programs did not come from the same input, or that the peaks in the cyana list are out of numerical order.
  • First, the peakIndex is checked. If it is less than the reserve number, the peak assignment from the original xeasy peaklist is put in the output peaklist.
  • Next, the number of assignments is checked. The assignments are checked for a match ONLY if both autostructure and cyana have ONLY ONE assignment for the peak. Otherwise, all possible assignments for the peak are put in the xeasy .assign file, and the peak is put in the output peaklist unassigned.
  • If each peaklist shows one assignment, the autostructure assignment is translated to be more like the cyana assignment according to the following checks:
      • "H" becomes "HN"
      • "HA2" becomes "HA1"
      • "HA3" becomes "HA2"
      • In some cases cyana uses QB where sparky uses HB. If autostructure says HB, the protlist is checked to see whether HB is valid for this residue; if not, it becomes QB.
      • If "HD1" is not valid for this residue, it becomes "QD1"
      • If "HD2" is not valid for this residue, it becomes "QD2"
      • If "HG1" is not valid for this residue, it becomes "QG1"
      • If "HG2" is not valid for this residue, it becomes "QG2"
  • The pseudo atom usage of autostructure and cyana is different. Autostructure may assign a pseudo atom if two atoms have very similar chemical shifts. The pseudo atom usage is checked by seeing if the PX exists in the protlist. If it does, then this is left as a QX atom.
  • If the pseudo atom does not exist in the peaklist, it has been imposed by autostructure. Check whether the attached atom has a specific assignment, i.e. if a QCD is attached to an HD1, the QCD is changed to CD1. See Key issues one level back. If it does not, the autostructure pseudo atom assignment is flagged as "promiscuous" by changing the Q to a P.
  • Finally, the match is checked. The residues must be the same, and the atoms must either be the same or have a "promiscuous match", i.e.:
      • PB matches HB2 or HB3
      • PG matches HG2, HG3, QG2 or QG3
      • PD matches HD2, HD3, QD2 or QD3
      • PCD matches CD1 or CD2
      • PCE matches CE1 or CE2
  • Matched peaks are put in the output peaklist.
  • Mismatched peaks are unassigned and put in the output peaklist. Also, possible assignments are put in the xeasy assignment file.
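
The atom name translation and the promiscuous match described above can be sketched in Python as follows. This is an illustration only, not the UBNMR source. The names `translate_atom`, `assignments_match` and `PROMISCUOUS` are hypothetical, the protlist is assumed to be available as a set of valid atom names for the residue, and an assignment is assumed to be a (residue number, atom name) pair. The step that replaces an imposed pseudo atom by a specific atom based on the attached atom (QCD attached to HD1 becomes CD1) is left out of this sketch.

```python
# Promiscuous atoms and the specific atoms they may match (from the list above).
PROMISCUOUS = {
    "PB": {"HB2", "HB3"},
    "PG": {"HG2", "HG3", "QG2", "QG3"},
    "PD": {"HD2", "HD3", "QD2", "QD3"},
    "PCD": {"CD1", "CD2"},
    "PCE": {"CE1", "CE2"},
}

# Straight renames from autostructure naming to cyana naming.
RENAMES = {"H": "HN", "HA2": "HA1", "HA3": "HA2"}

# Proton names that fall back to a pseudo atom when not valid for the residue.
FALLBACK_TO_PSEUDO = ("HB", "HD1", "HD2", "HG1", "HG2")


def translate_atom(atom, valid_atoms):
    """Translate one autostructure atom name toward cyana naming.

    valid_atoms is the set of atom names the protlist lists for this
    residue (assumed representation).
    """
    atom = atom.strip()
    if atom in RENAMES:
        return RENAMES[atom]
    if atom in FALLBACK_TO_PSEUDO and atom not in valid_atoms:
        return "Q" + atom[1:]
    # A pseudo atom that the protlist does not know was imposed by
    # autostructure; flag it as promiscuous by changing Q to P.
    if atom.startswith("Q") and atom not in valid_atoms:
        return "P" + atom[1:]
    return atom


def assignments_match(auto, cyana):
    """Compare a translated autostructure assignment with a cyana one."""
    auto_residue, auto_atom = auto
    cyana_residue, cyana_atom = cyana
    if auto_residue != cyana_residue:
        return False
    if auto_atom == cyana_atom:
        return True
    return cyana_atom in PROMISCUOUS.get(auto_atom, set())


if __name__ == "__main__":
    print(translate_atom("QB", {"HB2", "HB3"}))
    print(assignments_match((12, "PB"), (12, "HB3")))
```

Using a single lookup table for the renames means that "HA3" becomes "HA2" and stops there, rather than being renamed a second time to "HA1".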

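The per-peak decision flow of the algorithm can be summarized in a short Python sketch. Again this is a hedged illustration under stated assumptions, not the consRun implementation: `decide_peak` and `FatalError` are hypothetical names, chemical shifts are assumed to be comparable as tuples, the default `match` is plain equality (a fuller match function would apply the translation and promiscuous rules), and the choice to emit the cyana assignment for a matched peak is an assumption, since the text only says that matched peaks are put in the output peaklist.

```python
class FatalError(Exception):
    """Stands in for the FATAL ERROR abort described in the algorithm."""


def decide_peak(peak_index, reserve_number, original_assignment,
                cyana_shifts, auto_shifts,
                cyana_assignments, auto_assignments,
                match=lambda auto, cyana: auto == cyana):
    """Return (output_assignment, candidates_for_assign_file) for one peak.

    output_assignment is None when the peak is left unassigned.
    """
    # Differing shifts mean the lists are not aligned peak by peak.
    if tuple(cyana_shifts) != tuple(auto_shifts):
        raise FatalError("chemical shifts differ for peak %d" % peak_index)

    # Peaks below the reserve number keep their original xeasy assignment.
    if peak_index < reserve_number:
        return original_assignment, []

    # Only compare when both programs give exactly one assignment.
    if len(cyana_assignments) == 1 and len(auto_assignments) == 1:
        if match(auto_assignments[0], cyana_assignments[0]):
            # Assumption: the cyana form of the assignment is written out.
            return cyana_assignments[0], []

    # Ambiguous or mismatched: unassign, keep all candidates for the
    # xeasy .assign file.
    return None, list(cyana_assignments) + list(auto_assignments)


if __name__ == "__main__":
    shifts = (8.21, 120.5, 4.35)
    print(decide_peak(15, 10, None, shifts, shifts, [(3, "HN")], [(3, "HN")]))
    print(decide_peak(15, 10, None, shifts, shifts, [(3, "HN")], [(4, "HN")]))
```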


-- DavidParish - 02 Oct 2007