MUSTANG

MUltiple (protein) STructural AligNment alGorithm
Reference: A. S. Konagurthu, J. C. Whisstock, P. J. Stuckey, A. M. Lesk, Proteins: Structure, Function, and Bioinformatics 64(3):559-574 (August 2006) [preprint]

MUSTANG source

Download

Source Code (v3.2.3)
md5sum: 8ace0194a374a60dbce15fa0b1579b44
Patch file to version (v3.2.2)
Instructions for building MUSTANG

Dependencies: GNU Make or an equivalent, and a modern C++ compiler. MUSTANG is known to build with g++ >= 4.1.2. If these dependencies are met, follow these instructions:

  1. Download the source code from the link above.
  2. Extract the archive with:
    tar -zxvf mustang_v3.2.3.tgz
  3. Type: cd MUSTANG_v3.2.3
  4. Build MUSTANG with: make
  5. The built binary will appear in the bin/ subdirectory.
  6. Test the binary by running the following from the MUSTANG_v3.2.3/ directory:
    ./bin/mustang-3.2.3 -f data/test/test_zf-CCHH
  7. This should produce the following files:
    • results.html
    • results.pdb
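
The steps above can be collected into one script. This is a minimal sketch, not part of the MUSTANG distribution: it assumes the archive has already been downloaded into the current directory under the name used in step 2, and that GNU coreutils' md5sum is available (on systems without it, substitute the local equivalent). The helper names check_md5 and build_mustang are hypothetical.

```shell
#!/bin/sh
# Archive name and checksum are copied from the download section above.
TARBALL=mustang_v3.2.3.tgz
EXPECTED_MD5=8ace0194a374a60dbce15fa0b1579b44

# check_md5 FILE EXPECTED
# Succeeds only if the MD5 digest of FILE equals EXPECTED.
check_md5() {
    actual=$(md5sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}

# build_mustang
# Verifies the archive, unpacks it, builds, and runs the bundled test.
# The subshell keeps the 'cd' from changing the caller's directory.
build_mustang() {
    if ! check_md5 "$TARBALL" "$EXPECTED_MD5"; then
        echo "Checksum mismatch for $TARBALL" >&2
        return 1
    fi
    tar -zxvf "$TARBALL" || return 1
    (
        cd MUSTANG_v3.2.3 &&
        make &&
        ./bin/mustang-3.2.3 -f data/test/test_zf-CCHH
    )
}

# Only attempt the build if the archive is present.
if [ -f "$TARBALL" ]; then
    build_mustang
else
    echo "Download $TARBALL first, then re-run this script." >&2
fi
```

Checking the digest before unpacking catches a truncated or corrupted download early, before make fails with a less obvious error.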
Copyright license

 Copyright (c) 2005-, Arun Konagurthu, Monash University.
 arun DOT konagurthu AT monash DOT edu
 http://lcb.infotech.monash.edu.au/mustang
 All rights reserved.

 Redistribution and use in source and binary forms, with or without modification,
 are permitted provided that the following conditions are met:

 * Redistributions of source code must retain the above copyright notice, this
 list of conditions and the following disclaimer.
 * Redistributions in binary form must reproduce the above copyright notice, this
 list of conditions and the following disclaimer in the documentation and/or
 other materials provided with the distribution.
 * Neither the name of the University of Melbourne nor the names of its
 contributors may be used to endorse or promote products derived from this
 software without specific prior written permission.

 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
 ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
 ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
	  
Bug reports

Contact Arun Konagurthu for bug reports, web page errors, or questions.

 

Command line options

MUSTANG is a program that implements an algorithm for structural
alignment of multiple protein structures. Given a set of PDB files, the
program uses the spatial information in the C-alpha atoms of the set to
produce a sequence alignment. Based on a progressive pairwise heuristic,
the algorithm then proceeds through a number of refinement
passes. MUSTANG reports the multiple sequence alignment and the
corresponding superposition of structures.

To keep the command line short, the user can write the path and file names into a description file and supply that file on the command line with the '-f' option. For an example, see the file used to test the installation: '/usr/share/doc/mustang/examples/test_zf-CCHH'.

PATH should have the prefix '>'. When the program parses this file, it looks for the line starting with the '>' symbol (whitespace before and after the symbol is ignored). The PATH to the directory containing the PDB files of the structures to be aligned should follow. See, for example, '/usr/share/doc/mustang/examples/test_zf-CCHH'.

FILENAMES should have the prefix '+' (whitespace before and after this symbol is ignored). If PATH is specified, only the file names should be provided after the '+' symbol. However, if the PATH line is NOT provided, the absolute or relative paths of the structure files should be provided.
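
As an illustration of the two layouts just described, the following sketch writes one description file with a PATH line and one without. The directory and PDB file names are hypothetical placeholders, and listing one '+' entry per line is an assumption; compare with the bundled test_zf-CCHH file before relying on it.

```shell
#!/bin/sh
# Layout 1: a '>' line gives the directory, so '+' lines hold bare file names.
# All names below are placeholders, not files shipped with MUSTANG.
cat > my_structures.txt <<'EOF'
> /path/to/pdbs
+ 1abc.pdb
+ 2xyz.pdb
+ 3def.pdb
EOF

# Layout 2: no '>' line, so each '+' line carries its own
# absolute or relative path.
cat > my_structures_nopath.txt <<'EOF'
+ ./pdbs/1abc.pdb
+ ./pdbs/2xyz.pdb
+ /data/other/3def.pdb
EOF

# Either file would then be passed with the -f option, for example:
#   ./bin/mustang-3.2.3 -f my_structures.txt
```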

The description file format is described further under the -f command line option.

OPTIONS

-p <path>
Path to the directory holding the (PDB) structures to be aligned.
-i <struct-1> <struct-2>...
Input structures to be aligned. Note: if the -p option is used on the command line, supply only the file names of the structures; if not, give the absolute or relative path of each input structure.
-f <description file>
This option is used to AVOID entering the path (-p) and file name (-i) details on the command line. Instead, to keep the command line short, the user can enter the path and file name details in a "description" file and supply it on the command line. The format of the description file is further discussed in the description above. Note: the options {-p, -i} and {-f} are mutually exclusive.
-o <output identifier>
A common identifier for the various outputs of the program. Appropriate extensions (e.g. <identifier>.html, <identifier>.pdb, <identifier>.msf) will be added to this identifier depending on the options the user specifies on the command line. DEFAULT output identifier: 'results'
-F <format>
Alignment output format. The choices for <format> are: 'html', 'fasta', 'pir', 'msf'. DEFAULT format: 'html'
-D [CA-CA diameter]
Produce an HTML file in which residues are reported in lower case with a grey background when the aligned (superposed) CA-CA diameter of the residues in a column of the alignment is greater than the CA-CA diameter threshold.
-s [<ON>/<OFF>]
Generate a PDB file containing optimal superposition of all the structures based on the alignment. DEFAULT: 'ON'.
-r [<ON>/<OFF>]
Print to a file the RMSD table of the multiple superposition, along with the rotation matrix and translation vector corresponding to each input structure. DEFAULT: 'OFF'.
--help
Display a help message and exit.
--version
Output version information and exit.
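
The options above might be combined as in the sketch below. The wrapper run_mustang is a hypothetical helper that prints each command before running it; the directory, PDB file names, description file name, and output identifier are placeholders; and the binary path is the one produced by the build instructions, which may differ on an installed system. The ON/OFF arguments are assumed to be passed as separate words after -s and -r.

```shell
#!/bin/sh
# Path to the binary; override by setting MUSTANG in the environment.
MUSTANG=${MUSTANG:-./bin/mustang-3.2.3}

# run_mustang ARGS...
# Prints the full command, then runs it only if the binary is executable,
# so the script can be used as a dry run on a machine without MUSTANG.
run_mustang() {
    echo "+ $MUSTANG $*"
    if [ -x "$MUSTANG" ]; then
        "$MUSTANG" "$@"
    fi
}

# Form 1: path and file names on the command line (-p with -i),
# MSF output named zf_align.msf, plus the RMSD table.
run_mustang -p ./pdbs -i 1abc.pdb 2xyz.pdb 3def.pdb -o zf_align -F msf -r ON

# Form 2: the same inputs taken from a description file (-f).
# -f must not be combined with -p or -i.
run_mustang -f my_structures.txt -o zf_align -F msf -r ON
```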

 

REFERENCE

A. S. Konagurthu, J. C. Whisstock, P. J. Stuckey, and A. M. Lesk, MUSTANG: A multiple structural alignment algorithm, Proteins: Structure, Function, and Bioinformatics 64(3):559-574 (2006).