Help for DEDAL on the Essentia Proteomica server

Mode

DEDAL applies three basic procedures:
  1. TS (Tree Search) – Extensive, accurate search based on traversing a decision tree. It returns alignments with the maximal possible score. An alignment may comprise several spatially disjoint regions (e.g. domains).
  2. CTS (Constrained Tree Search) – Modification of the TS algorithm which restricts the search to alignments composed of one spatially continuous region. Useful for large, single-domain structures.
  3. MC (Monte Carlo) – A Monte Carlo heuristic which approximates the TS algorithm. Useful in difficult cases (e.g. structures which contain repeated structural motifs).
These procedures can be run either on all pairs of similar descriptors or only on pairs which contain at least three similar segments. The second approach is recommended because it prevents the generation of poor-quality alignments containing several insignificantly small regions. Altogether, the DEDAL server offers the following modes:
  1. TS, descriptors with >=3 segments – TS algorithm run on 3-segmented descriptors
  2. CTS, descriptors with >=3 segments – CTS algorithm run on 3-segmented descriptors
  3. MC, descriptors with >=3 segments – MC algorithm run on 3-segmented descriptors
  4. TS, all descriptors – TS algorithm run on all descriptors
  5. CTS, all descriptors – CTS algorithm run on all descriptors
  6. MC, all descriptors – MC algorithm run on all descriptors
  7. TS, descriptors with >=3 segments + CTS refinement on all descriptors – TS algorithm run on 3-segmented descriptors; alignments are further expanded with CTS algorithm using all remaining descriptors (recommended)
  8. CTS, descriptors with >=3 segments + CTS refinement on all descriptors – CTS algorithm run on 3-segmented descriptors; alignments are further expanded with CTS algorithm using all remaining descriptors (recommended)
  9. MC, descriptors with >=3 segments + CTS refinement on all descriptors – MC algorithm run on 3-segmented descriptors; alignments are further expanded with CTS algorithm using all remaining descriptors
Modes 7 and 8 are recommended for most cases. Mode 8 effectively restricts the solution space to alignments comprising only one continuous region.
Mode 9 may be useful in rare cases where the highest-scoring alignment is not represented by a maximal clique, because other, larger alignments are of lower quality.
Modes 1 to 3 are faster and are useful for determining overall similarity without the refinement stage.
Modes 4 to 6 may be employed for non-globular proteins, which may lack pairs of similar 3-segmented descriptors.
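
The nine modes follow a regular pattern: three algorithms crossed with three descriptor settings. The sketch below encodes that pattern as a lookup table. It is only an illustration of the numbering used in this help page; the names `Mode` and `MODES` are hypothetical and are not part of the server.

```python
from typing import NamedTuple


class Mode(NamedTuple):
    algorithm: str        # "TS", "CTS" or "MC"
    min_segments: int     # 3 = descriptors with >=3 segments, 0 = all descriptors
    cts_refinement: bool  # True if alignments are expanded with CTS afterwards


def build_modes():
    """Build the mode table in the order listed in this help page."""
    modes = {}
    number = 1
    # Modes 1-3: >=3 segments; 4-6: all descriptors; 7-9: >=3 segments + CTS refinement.
    for min_segments, refinement in ((3, False), (0, False), (3, True)):
        for algorithm in ("TS", "CTS", "MC"):
            modes[number] = Mode(algorithm, min_segments, refinement)
            number += 1
    return modes


MODES = build_modes()
```

For example, `MODES[7]` describes the TS run on 3-segmented descriptors followed by CTS refinement.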

Max. sequence offset

The sequence offset enables sequence-dependent comparisons. It assumes that a direct 1:1 correspondence between the two protein sequences exists; only residues aligned with an offset not greater than the given value k are counted. This setting is especially useful for comparing models of the same protein in structure prediction applications. If the lengths of the compared structures differ, the offset is measured from the closest alignment with gaps located only in the shorter sequence.
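
As a rough illustration of the offset criterion, the sketch below filters aligned residue pairs for the simple case of two structures of equal length. It assumes residues are identified by consecutive integer positions; the function name `filter_by_offset` is hypothetical, and the handling of structures of different lengths described above is not covered.

```python
def filter_by_offset(pairs, max_offset):
    """Keep aligned residue pairs (i, j) whose positions differ by at most max_offset.

    pairs: iterable of (position in structure 1, position in structure 2).
    Assumes equal-length structures with a direct 1:1 sequence correspondence.
    """
    return [(i, j) for i, j in pairs if abs(i - j) <= max_offset]
```

With `max_offset = 0`, only residues aligned to their exact sequence counterparts are counted.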

Max. sequence swaps

Maximal number of segment swaps allowed in the computed alignment. A value of -1 means no limit. During computation, if the processed alignment contains more swaps than the given limit, the largest subalignment containing at most the allowed number of swaps is considered instead. For example, a value of 1 means that only circular permutations and alignments without swaps are returned.
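
This help page does not define exactly how swaps are counted. One reading that is consistent with the example above (an assumption, not necessarily what the server does) is to order the aligned segments by their position in the first structure and count the places where the order in the second structure is broken. Under that reading, a sequential alignment has 0 swaps and a circular permutation has exactly 1. The function names below are hypothetical.

```python
def count_swaps(segment_pairs):
    """Count order breaks between consecutive aligned segments.

    segment_pairs: list of (start_in_a, start_in_b), one tuple per aligned
    segment, with distinct start positions in each structure.
    ASSUMPTION: a "swap" is a point where the next segment in structure A
    is not the next segment in structure B.
    """
    ordered = sorted(segment_pairs)                    # order along structure A
    b_sorted = sorted(b for _, b in ordered)           # order along structure B
    ranks = [b_sorted.index(b) for _, b in ordered]    # B-rank of each segment
    return sum(1 for prev, cur in zip(ranks, ranks[1:]) if cur != prev + 1)


def within_swap_limit(segment_pairs, max_swaps):
    """Return True if the alignment respects the limit; -1 means no limit."""
    return max_swaps == -1 or count_swaps(segment_pairs) <= max_swaps
```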

Structure

PDB or SCOP code

Code of a protein structure in the PDB or SCOP database. PDB codes are composed of 4 alphanumeric characters (e.g. 1m55) and can be followed by a letter denoting a chosen chain (e.g. 1m55A); without a chain letter, the whole molecule is used in the alignment. Codes of SCOP domains are 7 characters long and begin with the letter d (e.g. d1m55a_).
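
The code formats described above can be checked before submission with a short script. This is a minimal sketch based only on the rules stated in this section; the function name `classify_code` is hypothetical, and the server may apply stricter or different validation.

```python
import re

# 4 alphanumeric characters, optionally followed by one chain letter.
PDB_RE = re.compile(r"^([0-9A-Za-z]{4})([A-Za-z])?$")
# 7 characters in total, starting with the letter d.
SCOP_RE = re.compile(r"^d\S{6}$")


def classify_code(code):
    """Return (database, code, chain) for a recognised code, or None."""
    code = code.strip()
    match = PDB_RE.match(code)
    if match:
        return ("pdb", match.group(1), match.group(2))  # chain is None if absent
    if SCOP_RE.match(code):
        return ("scop", code, None)
    return None
```

For example, `classify_code("1m55A")` yields the PDB code `1m55` with chain `A`, while `classify_code("d1m55a_")` is recognised as a SCOP domain.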

File

A properly formatted all-atom PDB file should be uploaded. Coordinates of sidechain atoms are required to determine the geometrical centers of residues, which are used to compute inter-residue contacts. Structures with fewer than 5 residues are considered invalid. Multiple chains are accepted, although submitting entire crystal cells with internal symmetries may significantly increase the computation time and produce multiple insignificant alignments.
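
A file can be checked locally before upload. The sketch below reads ATOM records using the fixed column layout of the PDB format, computes one geometrical center per residue, and applies the 5-residue minimum. It assumes the center is the plain average over all atoms listed for a residue, which may differ from the server's definition; the function names are hypothetical.

```python
from collections import defaultdict

MIN_RESIDUES = 5  # structures with fewer residues are considered invalid


def residue_centers(pdb_text):
    """Map (chain ID, residue number + insertion code) to the mean atom position."""
    atoms = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        # PDB fixed columns: chain ID in column 22, residue number and
        # insertion code in columns 23-27, x/y/z in columns 31-54.
        key = (line[21], line[22:27])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms[key].append(xyz)
    return {
        key: tuple(sum(axis) / len(coords) for axis in zip(*coords))
        for key, coords in atoms.items()
    }


def is_valid_structure(pdb_text):
    """Return True if the file contains at least MIN_RESIDUES residues."""
    return len(residue_centers(pdb_text)) >= MIN_RESIDUES
```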