Tool: Matchmaker
The Matchmaker tool
superimposes protein or nucleic acid structures by first creating
a pairwise sequence alignment, then fitting the aligned residue pairs.
It is also implemented as the
matchmaker command.
Residue types and/or protein secondary structure information
can be used to align the sequences, and the pairwise alignment can be shown
in the Sequence Viewer.
Fitting uses one point per residue: CA in
amino acid residues and C4' in nucleic acid residues.
If a nucleic acid residue lacks a C4' atom
(some lower-resolution structures are P traces),
its P atom will be paired with the P atom of the aligned residue.
To use a different set of atoms, including those not in biopolymer chains,
see the align command instead.
See also:
Fit in Map,
morph,
view,
dssp,
measure rotation,
save PDB
The method was originally implemented in Chimera, as described in:
Tools for integrated sequence-structure analysis with UCSF Chimera.
Meng EC, Pettersen EF, Couch GS, Huang CC, Ferrin TE.
BMC Bioinformatics. 2006 Jul 12;7:339.
Matchmaker can be opened from the Structure Analysis
section of the Tools menu and manipulated like other panels.
It contains three tabbed sections, explained in detail below:
Chain Pairing, Alignment, and Fitting.
Three buttons manage the Matchmaker preferences, and a
checkbox sets whether they apply to all three tabbed sections
or only the one that is currently shown:
- Save saves the current Matchmaker parameters as preferences
- Reset resets the dialog to the factory default
parameter settings without changing any preferences
- Restore populates the dialog with the last saved preferences
Clicking OK or Apply will start the calculations
with or without closing the dialog, respectively.
Close simply closes the dialog, while Help opens
this page in the Help Viewer.
Sequence alignment scores, parameter values, and root-mean-square deviations
(RMSD values) will be reported in the Log
(see also verbose logging).
If the fit is iterated,
the final RMSD over all residue pairs (columns in the sequence alignment)
will be reported along with the RMSD over the pruned set of pairs.
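As a rough illustration of the two reported values, the sketch below computes an RMSD from per-pair distances, once over all pairs and once over a pruned subset. It is not Matchmaker's own code: the function name rmsd and the sample distances are hypothetical, and the pruned set is simply approximated as the pairs within a 2.0 Å cutoff.

```python
import math


def rmsd(distances):
    """Root-mean-square of per-pair distances (in angstroms)."""
    if not distances:
        raise ValueError("need at least one residue pair")
    return math.sqrt(sum(d * d for d in distances) / len(distances))


# Hypothetical distances between paired atoms after the final fit.
all_pairs = [0.4, 0.7, 1.1, 0.9, 3.8, 5.2]

# Stand-in for the pruned set: the pairs at or under the cutoff.
cutoff = 2.0
pruned_pairs = [d for d in all_pairs if d <= cutoff]

print(f"RMSD over all {len(all_pairs)} pairs: {rmsd(all_pairs):.3f}")
print(f"RMSD over {len(pruned_pairs)} pruned pairs: {rmsd(pruned_pairs):.3f}")
```

Because the pruned set drops the farthest-apart pairs, its RMSD is never larger than the RMSD over all pairs.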
Chain Pairing
The chain-pairing method dictates what choices of structure
are available in the top section of the dialog.
- Best-aligning pair of chains between reference and match structure
(initial default)
– One reference structure and one or more structures to match
should be chosen. For each structure to be matched, the
reference-match pair of chains with the highest
sequence alignment score will be used.
- Specific chain in reference structure
and best-aligning chain in match structure
– One reference chain and one or more structures to match
should be chosen. For each structure to be matched,
the chain that aligns to the reference chain with the highest
sequence alignment score will be used.
- Specific chain(s) in reference structure
with specific chain(s) in match structure
– One or more reference chains should be chosen from the list.
For each reference chain chosen, one chain to be matched should
be chosen from the corresponding pulldown menu. If multiple chains are to
be matched to the same reference chain, it is necessary to match them
in separate steps (by choosing the chain to match and then clicking
Apply). A given chain cannot be matched to two different
reference chains simultaneously, and chains from the same structure
(atomic model)
cannot simultaneously serve as a reference chain and a chain to match.
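The first two pairing methods both come down to choosing the highest sequence alignment score. A minimal sketch of that choice, assuming the scores have already been computed and stored in a dictionary keyed by (reference chain, match chain); the function names and the score values are hypothetical, not part of Matchmaker:

```python
def best_pair(scores):
    """Best-aligning pair of chains between reference and match structure."""
    return max(scores, key=scores.get)


def best_match_for(ref_chain, scores):
    """Best-aligning match chain for one specific reference chain."""
    candidates = {pair: s for pair, s in scores.items() if pair[0] == ref_chain}
    if not candidates:
        raise ValueError(f"no scores for reference chain {ref_chain!r}")
    return max(candidates, key=candidates.get)[1]


# Hypothetical alignment scores: (reference chain, match chain) -> score.
scores = {
    ("A", "A"): 412.0,
    ("A", "B"): 95.5,
    ("B", "A"): 88.0,
    ("B", "B"): 430.5,
}

print(best_pair(scores))            # pair with the highest score overall
print(best_match_for("A", scores))  # best match chain for reference chain A
```

With several structures to match, the same selection would be repeated once per match structure.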
Also restrict to selection allows ignoring
residues of the reference and/or match structures that are not
selected.
In general, restriction should only be used in specific cases
to suppress results that would otherwise be obtained.
For example, two chains that would
otherwise align on their N-terminal domains can be forced to align on
their C-terminal domains by selecting the C-terminal domains and
using the restriction option. Otherwise, restriction is not recommended,
because full-length alignments tend to be of higher quality, and
iteration already serves to exclude
poorly superimposed regions from the final fit.
Although unselected parts of matched chains will appear in the resulting
sequence alignment (if shown), they have simply
been added back in as “filler,” without consideration of
how the characters align, after alignment and matching of only the
selected residues.
Alignment
- Show pairwise sequence alignment(s)
(initial default off) – whether to display the resulting pairwise
reference-match sequence alignments; each will be shown in a separate
Sequence Viewer window.
When fit iteration is employed,
the pairs used in the final fit will be shown in the alignment as a
region named
matched residues.
An RMSD header is automatically
shown above the sequences, with bar heights
representing the spatial variation among residues associated with a column.
Note: These sequence alignments are a by-product of superposition,
and may not be entirely correct.
Successful superposition only requires these alignments to be partly
correct, as incorrect portions tend to be excluded from the fit during
iteration.
If the sequences are easy to align (highly similar),
the sequence alignments are likely to be correct throughout.
However, if the sequences are more distantly related,
parts of the alignments may be incorrect even when the superposition is good.
Note: When the fit has been restricted to selected residues,
the unselected residues of matched chains will still appear in the alignment,
but merely as a convenient compact representation; how they are aligned
is not meaningful.
- Sequence alignment algorithm:
- Needleman-Wunsch (initial default) – global
- Smith-Waterman – local
Sequence alignment scores may include contributions from residue similarity,
secondary structure, and gap penalties.
- Matrix (initial default BLOSUM-62)
– which substitution matrix
to use for the residue similarity part of the score.
If an amino acid matrix is chosen, only peptide sequences
will be aligned; if a nucleic acid matrix is chosen, only
nucleic acid sequences will be aligned. An error message will appear
if there are no reference-match pairs of the appropriate type.
- Gap opening penalty (initial default 12)
– if secondary structure scoring is on,
this parameter is ignored and the
secondary-structure-specific
gap-opening penalties are used instead
- Gap extension penalty (initial default 1)
- Include secondary structure score
(initial default on)
– whether to include a secondary structure term in the score,
with additional parameters:
- Compute secondary structure assignments
(initial default on)
– whether to first identify helices and strands by running the
dssp algorithm; may improve
superposition by generating consistent assignments, as pre-existing
assignments may reflect the use of different criteria on different structures
- Overwrite previous assignments (initial default off)
– whether to overwrite pre-existing secondary structure assignments
with the newly computed ones. Otherwise, the new assignments are used only
temporarily for superposition purposes.
- Secondary structure weighting (initial default 0.30)
– fractional weight f of the secondary structure contribution
to the overall score, with (1 – f) used to weight the
residue similarity contribution. For example, a value of 0.30 means:
total score = 0.30(secondary structure score) + 0.70(residue similarity score) - gap penalties
Setting the slider to 0.0 is not the same as turning off
secondary-structure scoring.
When the option is on, the secondary-structure-specific gap opening penalties
are used regardless of the slider position.
The values in the secondary-structure Scoring matrix
(for all pairwise combinations of H helix, S strand, and
O other) and the secondary-structure-specific gap opening penalties
(Intra-helix, Intra-strand, and Any other) can be adjusted.
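The weighting formula above can be written out as a small function. This is only a sketch of the arithmetic: the function name, argument names, and sample scores are hypothetical, and it does not model which gap penalties (general or secondary-structure-specific) produced the gap_penalties total.

```python
def total_score(ss_score, similarity_score, gap_penalties, ss_fraction=0.30):
    """Combine the score terms using fractional weight f = ss_fraction."""
    if not 0.0 <= ss_fraction <= 1.0:
        raise ValueError("ss_fraction must be between 0 and 1")
    weighted_ss = ss_fraction * ss_score
    weighted_similarity = (1.0 - ss_fraction) * similarity_score
    return weighted_ss + weighted_similarity - gap_penalties


# Hypothetical component scores for one pairwise alignment.
print(total_score(ss_score=40.0, similarity_score=200.0, gap_penalties=25.0))

# With f = 0.0 the secondary structure term drops out of the sum.
print(total_score(40.0, 200.0, 25.0, ss_fraction=0.0))
```

Even with the weight at 0.0, the option still affects the result in Matchmaker, because the secondary-structure-specific gap opening penalties remain in use while it is on.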
Fitting
Fitting uses one point per residue: CA atoms in amino acids and
C4' atoms in nucleic acids. If a nucleic acid residue lacks a C4' atom
(some lower-resolution structures are P traces),
its P atom will be paired with the P atom of the aligned residue.
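The atom choice described above can be sketched as a simple lookup with a fallback. The function below is illustrative only: its name, the "protein"/"nucleic" labels, and the use of plain sets of atom names are assumptions, as is returning None when no suitable atom is present in both residues.

```python
def fit_atom_name(polymer_type, ref_atoms, match_atoms):
    """Pick the atom name used to pair two aligned residues, or None."""
    if polymer_type == "protein":
        preferred = ["CA"]
    elif polymer_type == "nucleic":
        # C4' is preferred; P is the fallback for P-trace residues.
        preferred = ["C4'", "P"]
    else:
        raise ValueError(f"unknown polymer type: {polymer_type!r}")
    for name in preferred:
        if name in ref_atoms and name in match_atoms:
            return name
    return None  # assumption: a pair with no shared fitting atom is skipped


print(fit_atom_name("protein", {"N", "CA", "C", "O"}, {"N", "CA", "C", "O"}))
print(fit_atom_name("nucleic", {"P", "C4'", "N1"}, {"P"}))
```

In the second call, one residue has only a P atom, so the P atoms of both residues are paired.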
- Iterate by pruning long atom pairs
(initial default on)
– whether to iteratively remove far-apart residue pairs from
the “match list” used to superimpose the structures. This does not
change the initial sequence alignment, but restricts which columns of
that alignment will be used in the final fit.
Otherwise, all of the columns containing both sequences
(i.e. without a gap) will be used. In each cycle of iteration,
atom pairs are removed from the match list and the remaining
pairs are fitted, until no matched pair is
more than the iteration cutoff distance apart.
The atom pairs removed are either the 10% farthest apart of all pairs
or the 50% farthest apart of all pairs exceeding the cutoff, whichever
is the lesser number of pairs.
Iteration tends to exclude sequence-aligned but conformationally dissimilar
regions such as flexible loops, allowing a tighter fit of the
best-matching "core" regions.
- Iteration cutoff distance
(initial default 2.0 Å)
- Verbose logging (initial default off)
– whether to send additional information to the
Log for each chain-chain pair:
- Sequences:
followed by the pairwise sequence alignment, i.e., two lines,
each containing a sequence name and (gapped) sequence
- Residues:
followed by two lines, each a comma-separated list of the structure residues
associated with the nongap positions of the corresponding sequence;
missing structure residues are reported as None
- Residue usage in match (1=used, 0=unused):
followed by two lines, each a comma-separated list of zeros and ones,
indicating which structure residues were used in the final fit
- Log transformation matrix
(initial default off)
– whether to show the final-fit transformation matrix (or matrices)
in the Log
- If one model being matched, also move these models along with it
– if only one match model is designated in the top section
of the dialog, one or more additional models to move along with it
can be chosen from the model list
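The pruning rule described under "Iterate by pruning long atom pairs" can be sketched for a single cycle as follows. This is not Matchmaker's implementation: the function name is hypothetical, rounding down the percentages and always removing at least one pair are assumptions, and the refitting step between cycles is left out.

```python
def pairs_to_prune(distances, cutoff=2.0):
    """Return indices of the pairs to drop in one iteration cycle."""
    over = [i for i, d in enumerate(distances) if d > cutoff]
    if not over:
        return []  # every pair is within the cutoff: iteration stops

    # Candidate counts: 10% of all pairs, or 50% of pairs over the cutoff.
    ten_percent_of_all = int(0.10 * len(distances))
    half_of_over = int(0.50 * len(over))

    # Use the lesser count, but remove at least one pair (assumption).
    n_remove = max(1, min(ten_percent_of_all, half_of_over))

    # Drop the farthest-apart pairs first.
    farthest_first = sorted(over, key=lambda i: distances[i], reverse=True)
    return sorted(farthest_first[:n_remove])


# Hypothetical distances (angstroms) for 10 matched pairs.
distances = [0.5, 1.0, 1.5, 3.0, 4.0, 5.0, 0.8, 1.2, 0.9, 6.0]
print(pairs_to_prune(distances))
```

Here four pairs exceed the 2.0 Å cutoff, so the candidate counts are 1 (10% of 10 pairs) and 2 (50% of 4 pairs); the lesser is used, and only the farthest pair is removed in this cycle. In a full iteration, the remaining pairs would be refitted and the distances recomputed before the next cycle.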
UCSF Resource for Biocomputing, Visualization, and Informatics /
April 2023