Clustal W (Thompson et al. 1994) is a program for global multiple sequence alignment. It aligns DNA or amino acid sequences using a progressive alignment algorithm with affine gap penalties and a guide tree based on sequence similarity. The affine gap cost model penalizes insertions and deletions with a linear function in which one term is independent of gap length and the other is proportional to it: Gap penalty = Gapopen + Len * Gapextend, where Len is the length of the gap. Several reviews compare multiple alignment algorithms (e.g., Hickson et al. 2000, Thompson et al. 1999, and McClure et al. 1994). Morrison and Ellis (1997) discuss the effects of nucleotide sequence alignment on the estimation of phylogenetic hypotheses. The current version is Clustal W2 (Larkin et al. 2007). The program is also available with a graphical user interface as Clustal X.
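The affine gap formula can be illustrated with a minimal Python sketch. The function name and the default values for Gapopen and Gapextend are hypothetical choices made for this example and are not taken from Clustal itself.

```python
def affine_gap_penalty(length, gap_open=10.0, gap_extend=0.5):
    """Cost of one contiguous gap under the affine model.

    Implements: penalty = Gapopen + Len * Gapextend.
    The default parameter values are illustrative assumptions only.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0  # no gap, no penalty
    return gap_open + length * gap_extend


# One gap of length 4 versus four separate gaps of length 1.
one_long_gap = affine_gap_penalty(4)
four_short_gaps = sum(affine_gap_penalty(1) for _ in range(4))
```

Because the opening cost is paid once per gap, a single long gap is cheaper than several short gaps covering the same number of positions, so the model favors alignments with fewer, longer insertions and deletions.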

The method Clustal uses to construct the alignment is called pairwise progressive sequence alignment. This heuristic method first computes a pairwise alignment for every pair of sequences in the input set. A dendrogram (guide tree) is then built from the pairwise similarities of the sequences. Finally, the multiple sequence alignment is constructed by aligning sequences in the order defined by the guide tree.
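The first two steps (all-against-all pairwise scoring, then a guide tree) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Clustal's implementation: the names `nw_score` and `guide_tree` are hypothetical, the pairwise score uses a linear gap cost for brevity rather than the affine model, and average-linkage clustering is swapped in as a simple way to build a tree and is not claimed to be the method Clustal uses.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score (Needleman-Wunsch, linear gap cost for brevity)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def guide_tree(seqs):
    """Return nested tuples describing the order in which sequences are joined.

    seqs maps a sequence name to its sequence string.
    """
    # Step 1: score every pair of sequences.
    sim = {
        frozenset((x, y)): nw_score(seqs[x], seqs[y])
        for x, y in combinations(seqs, 2)
    }

    # Each cluster key is a subtree; its value lists the member sequence names.
    clusters = {name: [name] for name in seqs}

    def avg_sim(x, y):
        pairs = [(m, n) for m in clusters[x] for n in clusters[y]]
        return sum(sim[frozenset(p)] for p in pairs) / len(pairs)

    # Step 2: repeatedly join the two most similar clusters (average linkage).
    while len(clusters) > 1:
        x, y = max(combinations(clusters, 2), key=lambda p: avg_sim(*p))
        members = clusters.pop(x) + clusters.pop(y)
        clusters[(x, y)] = members

    return next(iter(clusters))


example = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TTTTGGGG",
    "D": "TTTTGGCC",
}
tree = guide_tree(example)
```

For the example input, A and B are joined first, then C and D, and the two pairs are joined last. In the third step a progressive aligner would follow this nesting, aligning the closest sequences first and then aligning the resulting groups to each other.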


