Tool: Matchmaker

The Matchmaker tool superimposes protein or nucleic acid structures by first creating a pairwise sequence alignment, then fitting the aligned residue pairs. It is also implemented as the matchmaker command.

Residue types and/or protein secondary structure information can be used to align the sequences, and the pairwise alignment can be shown in the Sequence Viewer. (Conversely, a sequence alignment already open in the Sequence Viewer can be used to guide the superposition of associated structures.)

Fitting uses one point per residue: CA in amino acid residues and C4' in nucleic acid residues. If a nucleic acid residue lacks a C4' atom (some lower-resolution structures are P traces), its P atom will be paired with the P atom of the aligned residue. To use a different set of atoms, including those not in biopolymer chains, see the align command instead.

See also: Fit in Map, morph, view, dssp, measure rotation, save PDB

The method was originally implemented in Chimera, as described in:

Tools for integrated sequence-structure analysis with UCSF Chimera. Meng EC, Pettersen EF, Couch GS, Huang CC, Ferrin TE. BMC Bioinformatics. 2006 Jul 12;7:339.

Matchmaker can be opened from the Structure Analysis section of the Tools menu and manipulated like other panels. It contains three tabbed sections, explained in detail below:

Chain pairing
Alignment
Fitting

Three buttons manage the preferences, and a checkbox sets whether they apply to all three tabbed sections or only to the one currently shown:

Save saves the current Matchmaker parameters as preferences
Reset resets the dialog to the factory default parameter settings without changing any preferences
Restore populates the dialog with the last saved preferences

Clicking OK or Apply will start the calculations with or without closing the dialog, respectively. Close simply closes the dialog, while Help opens this page in the Help Viewer.

Sequence alignment scores, parameter values, and root-mean-square deviations (RMSD values) will be reported in the Log (see also verbose logging). If the fit is iterated, the final RMSD over all residue pairs (columns in the sequence alignment) will be reported along with the RMSD over the pruned set of pairs.
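
To illustrate why two RMSD values are reported for an iterated fit, here is a minimal Python sketch. It is not Matchmaker's code; the coordinates are made up and assumed to be already superimposed, and the function name rmsd is only illustrative.

```python
import math

def rmsd(pairs):
    """Root-mean-square deviation over (reference_xyz, match_xyz) pairs."""
    if not pairs:
        raise ValueError("need at least one atom pair")
    total = 0.0
    for ref, match in pairs:
        # Squared distance between the two atoms of this pair.
        total += sum((r - m) ** 2 for r, m in zip(ref, match))
    return math.sqrt(total / len(pairs))

# Hypothetical coordinates (in Å) for four aligned residue pairs.
all_pairs = [
    ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    ((1.0, 0.0, 0.0), (1.0, 0.0, 1.0)),
    ((2.0, 0.0, 0.0), (2.0, 1.0, 0.0)),
    ((3.0, 0.0, 0.0), (3.0, 0.0, 7.0)),  # a poorly superimposed pair
]
# Assume iteration pruned the last pair from the match list.
pruned_pairs = all_pairs[:3]

rmsd_all = rmsd(all_pairs)        # over every aligned column
rmsd_pruned = rmsd(pruned_pairs)  # over the pairs kept in the final fit
```

In this toy case the single outlier dominates the all-pairs value, while the pruned value reflects how well the retained pairs fit.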

Chain Pairing

The chain-pairing method dictates what choices of structure are available in the top section of the dialog.

Best-aligning pair of chains between reference and match structure (initial default) – One reference structure and one or more structures to match should be chosen. For each structure to be matched, the reference-match pair of chains with the highest sequence alignment score will be used.
Specific chain in reference structure and best-aligning chain in match structure – One reference chain and one or more structures to match should be chosen. For each structure to be matched, the chain that aligns to the reference chain with the highest sequence alignment score will be used.
Specific chain(s) in reference structure with specific chain(s) in match structure – One or more reference chains should be chosen from the list. For each reference chain chosen, one chain to be matched should be chosen from the corresponding pulldown menu. If multiple chains are to be matched to the same reference chain, it is necessary to match them in separate steps (by choosing the chain to match and then clicking Apply). A given chain cannot be matched to two different reference chains simultaneously, and chains from the same structure (atomic model) cannot simultaneously serve as a reference chain and a chain to match.
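
The first two pairing methods both reduce to picking the highest-scoring entry from a set of pairwise alignment scores. The Python sketch below shows that selection under stated assumptions: the score table, chain IDs, and function names are hypothetical, and tie-breaking (first entry encountered) is a choice of this sketch, not documented Matchmaker behavior.

```python
def best_chain_pair(scores):
    """Best-aligning (reference_chain, match_chain) pair over all combinations."""
    return max(scores, key=scores.get)

def best_match_for(scores, ref_chain):
    """Best-aligning match chain for one specific reference chain."""
    candidates = {pair: s for pair, s in scores.items() if pair[0] == ref_chain}
    if not candidates:
        raise ValueError(f"no scores for reference chain {ref_chain!r}")
    return max(candidates, key=candidates.get)[1]

# Hypothetical alignment scores keyed by (reference chain, match chain).
scores = {
    ("A", "A"): 312.5,
    ("A", "B"): 87.0,
    ("B", "A"): 90.2,
    ("B", "B"): 298.1,
}

overall = best_chain_pair(scores)        # pair used by the first method
for_ref_b = best_match_for(scores, "B")  # chain used by the second method
```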

Also restrict to selection allows ignoring residues of the reference and/or match structures that are not selected. In general, restriction should only be used in specific cases to suppress results that would otherwise be obtained. For example, two chains that would otherwise align on their N-terminal domains can be forced to align on their C-terminal domains by selecting the C-terminal domains and using the restriction option. Otherwise, restriction is not recommended, because full-length alignments tend to be of higher quality, and iteration already serves to exclude poorly superimposed regions from the final fit. Although unselected parts of matched chains will appear in the resulting sequence alignment (if shown), they have simply been added back in as “filler,” without consideration of how the characters align, after alignment and matching of only the selected residues.

Alignment

Show pairwise sequence alignment(s) (initial default off) – whether to display the resulting pairwise reference-match sequence alignments; each will be shown in a separate Sequence Viewer window. When fit iteration is employed, the pairs used in the final fit will be shown in the alignment as a region named matched residues. An RMSD header is automatically shown above the sequences, with bar heights representing the spatial variation among residues associated with a column.
*These sequence alignments are a by-product of superposition, and may not be entirely correct. Successful superposition only requires these alignments to be partly correct, as incorrect portions tend to be excluded from the fit during iteration. If the sequences are easy to align (highly similar), the sequence alignments are likely to be correct throughout. However, if the sequences are more distantly related, parts of the alignments may be incorrect even when the superposition is good.
**When the fit has been restricted to selected residues, the unselected residues of matched chains will still appear in the alignment, but merely as a convenient compact representation; how they are aligned is not meaningful.
Sequence alignment algorithm:
- Needleman-Wunsch (initial default) – global
- Smith-Waterman – local
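
The difference between global and local alignment can be seen in a small Python sketch of the two dynamic-programming recurrences. This is a deliberately simplified illustration, not Matchmaker's implementation: it assumes match/mismatch scoring instead of a substitution matrix, a single linear gap penalty instead of separate opening and extension penalties, and no secondary structure term.

```python
def align_score(a, b, local=False, match=2, mismatch=-1, gap=-2):
    """Best alignment score: Needleman-Wunsch (global) or Smith-Waterman (local)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment charges for leading gaps in either sequence.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                # Local alignment may restart anywhere, so scores never go below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    # Global: score of aligning both full sequences; local: best-scoring segment.
    return best if local else score[-1][-1]

global_score = align_score("ACDE", "WWACDEWW")
local_score = align_score("ACDE", "WWACDEWW", local=True)
```

Here the local score counts only the shared ACDE segment, whereas the global score is also charged for the four unmatched flanking residues.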

Sequence alignment scores may include contributions from residue similarity, secondary structure, and gap penalties.

Matrix (initial default BLOSUM-62) – which substitution matrix to use for the residue similarity part of the score. If an amino acid matrix is chosen, only peptide sequences will be aligned; if a nucleic acid matrix is chosen, only nucleic acid sequences will be aligned. An error message will appear if there are no reference-match pairs of the appropriate type.
Gap opening penalty (initial default 12) – if secondary structure scoring is on, this parameter is ignored and the secondary-structure-specific gap-opening penalties are used instead
Gap extension penalty (initial default 1)
Include secondary structure score (initial default on) – whether to include a secondary structure term in the score, with additional parameters:
- Compute secondary structure assignments (initial default on) – whether to first identify helices and strands by running the dssp algorithm; may improve superposition by generating consistent assignments, as pre-existing assignments may reflect the use of different criteria on different structures
- Overwrite previous assignments (initial default off) – whether to overwrite pre-existing secondary structure assignments with the newly computed ones. Otherwise, the new assignments are used only temporarily for superposition purposes.
- Secondary structure weighting (initial default 0.30) – fractional weight f of the secondary structure contribution to the overall score, with (1 – f) used to weight the residue similarity contribution. For example, a value of 0.30 means:
  
  total score = 0.30(secondary structure score) + 0.70(residue similarity score) – gap penalties
  
  Setting the slider to 0.0 is not the same as turning off secondary-structure scoring. When the option is on, the secondary-structure-specific gap opening penalties are used regardless of the slider position.
  The values in the secondary-structure Scoring matrix (for all pairwise combinations of H helix, S strand, and O other) and the secondary-structure-specific gap opening penalties (Intra-helix, Intra-strand, and Any other) can be adjusted.
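
The weighting described above can be written out as a short Python sketch. The function and argument names are illustrative only, and the gap penalties are assumed to have been summed beforehand (using the secondary-structure-specific opening penalties when that option is on).

```python
def total_score(residue_similarity, ss_score, gap_penalties, ss_fraction=0.30):
    """Combine the score terms with fractional secondary structure weight f."""
    if not 0.0 <= ss_fraction <= 1.0:
        raise ValueError("ss_fraction must be between 0 and 1")
    return (ss_fraction * ss_score
            + (1.0 - ss_fraction) * residue_similarity
            - gap_penalties)

# Hypothetical summed terms for one pairwise alignment.
default_weighting = total_score(100.0, 40.0, 14.0)                 # f = 0.30
similarity_only = total_score(100.0, 40.0, 14.0, ss_fraction=0.0)  # f = 0.0
```

With f = 0.0 the secondary structure term drops out of the sum, but as noted above the gap penalties passed in would still be the secondary-structure-specific ones while the option is on.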

Fitting

Fitting uses one point per residue: CA atoms in amino acids and C4' atoms in nucleic acids. If a nucleic acid residue lacks a C4' atom (some lower-resolution structures are P traces), its P atom will be paired with the P atom of the aligned residue.

Iterate by pruning long atom pairs (initial default on) – whether to iteratively remove far-apart residue pairs from the “match list” used to superimpose the structures. This does not change the initial sequence alignment, but restricts which columns of that alignment will be used in the final fit. Otherwise, all of the columns containing both sequences (i.e. without a gap) will be used. In each cycle of iteration, atom pairs are removed from the match list and the remaining pairs are fitted, until no matched pair is more than the iteration cutoff distance apart. The atom pairs removed are either the 10% farthest apart of all pairs or the 50% farthest apart of all pairs exceeding the cutoff, whichever is the lesser number of pairs. Iteration tends to exclude sequence-aligned but conformationally dissimilar regions such as flexible loops, allowing a tighter fit of the best-matching "core" regions.
Iteration cutoff distance (initial default 2.0 Å)
Verbose logging (initial default off) – whether to send additional information to the Log for each chain-chain pair:
- Sequences: followed by the pairwise sequence alignment, i.e., two lines, each containing a sequence name and (gapped) sequence
- Residues: followed by two lines, each a comma-separated list of the structure residues associated with the nongap positions of the corresponding sequence; missing structure residues are reported as None
- Residue usage in match (1=used, 0=unused): followed by two lines, each a comma-separated list of zeros and ones, indicating which structure residues were used in the final fit
Log transformation matrix (initial default off) – whether to show the final-fit transformation matrix (or matrices) in the Log
Log parameter values (initial default on) – whether to report the run parameters in the Log
If one model being matched, also move these models along with it – if only one match model is designated in the top section of the dialog, one or more additional models to move along with it can be chosen from the model list
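
The pruning rule used by Iterate by pruning long atom pairs can be sketched in Python as the selection made in a single cycle. This is an illustration under assumptions, not the actual implementation: the rounding (round down, but remove at least one pair) and the function name are choices of this sketch. In the real procedure the remaining pairs are refitted and the distances recomputed before the next cycle.

```python
def pairs_to_prune(distances, cutoff=2.0):
    """Indices of atom pairs to drop in one pruning cycle ([] when finished)."""
    over = [i for i, d in enumerate(distances) if d > cutoff]
    if not over:
        return []  # every pair is within the cutoff, so iteration stops
    ten_percent_of_all = len(distances) // 10
    half_of_over_cutoff = len(over) // 2
    # Whichever is the lesser number of pairs; assumed minimum of one.
    n_remove = max(1, min(ten_percent_of_all, half_of_over_cutoff))
    farthest_first = sorted(over, key=lambda i: distances[i], reverse=True)
    return farthest_first[:n_remove]

# Hypothetical pair distances (Å) after a fit: 20 pairs, 6 beyond the cutoff.
distances = [0.5] * 14 + [2.5, 3.0, 4.0, 5.0, 6.0, 9.0]
dropped = pairs_to_prune(distances)  # 10% of 20 is 2; 50% of 6 is 3; drop 2
```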

UCSF Resource for Biocomputing, Visualization, and Informatics / March 2025