Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation

C D Livingstone; G J Barton

doi:10.1093/bioinformatics/9.6.745

Comput Appl Biosci. 1993 Dec;9(6):745-56. doi: 10.1093/bioinformatics/9.6.745.

Authors

C D Livingstone¹, G J Barton

Affiliation

¹ Laboratory of Molecular Biophysics, University of Oxford, UK.

PMID: 8143162

Abstract

An algorithm is described for the systematic characterization of the physico-chemical properties seen at each position in a multiple protein sequence alignment. The new algorithm allows questions important in the design of mutagenesis experiments to be quickly answered since positions in the alignment that show unusual or interesting residue substitution patterns may be rapidly identified. The strategy is based on a flexible set-based description of amino acid properties, which is used to define the conservation between any group of amino acids. Sequences in the alignment are gathered into subgroups on the basis of sequence similarity, functional, evolutionary or other criteria. All pairs of subgroups are then compared to highlight positions that confer the unique features of each subgroup. The algorithm is encoded in the computer program AMAS (Analysis of Multiply Aligned Sequences) which provides a textual summary of the analysis and an annotated (boxed, shaded and/or coloured) multiple sequence alignment. The algorithm is illustrated by application to an alignment of 67 SH2 domains where patterns of conserved hydrophobic residues that constitute the protein core are highlighted. The analysis of charge conservation across annexin domains identifies the locations at which conserved charges change sign. The algorithm simplifies the analysis of multiple sequence data by condensing the mass of information present, and thus allows the rapid identification of substitutions of structural and functional importance.
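The set-based idea in the abstract can be illustrated with a minimal sketch. It is not the AMAS implementation: the property table, the scoring rule (count the properties that every residue either has or lacks), and the names `PROPERTIES`, `shared_properties`, `conservation` and `compare_subgroups` are all assumptions made for illustration, and gaps are not handled.

```python
from itertools import combinations

# Illustrative property sets (an assumption; not the table used by AMAS).
PROPERTIES = {
    "hydrophobic": set("ACFGHIKLMTVWY"),
    "polar": set("CDEHKNQRSTWY"),
    "small": set("ACDGNPSTV"),
    "aromatic": set("FHWY"),
    "aliphatic": set("ILV"),
    "positive": set("HKR"),
    "negative": set("DE"),
    "charged": set("DEHKR"),
}


def shared_properties(residues):
    """Map each property on which all residues agree to True (all have it)
    or False (all lack it). Properties with mixed membership are omitted."""
    residues = set(residues)
    shared = {}
    for name, members in PROPERTIES.items():
        if residues <= members:
            shared[name] = True
        elif residues.isdisjoint(members):
            shared[name] = False
    return shared


def conservation(residues):
    """Hypothetical score: number of properties the residues agree on."""
    return len(shared_properties(residues))


def compare_subgroups(alignment, groups):
    """Compare all pairs of subgroups column by column.

    alignment: dict of sequence name -> aligned sequence (equal lengths, no gaps)
    groups:    dict of subgroup name -> list of sequence names
    Returns (group_a, group_b, column, properties) for every column where a
    property conserved in both subgroups has opposite states in the two.
    """
    length = len(next(iter(alignment.values())))
    differences = []
    for (ga, a_names), (gb, b_names) in combinations(groups.items(), 2):
        for col in range(length):
            a = shared_properties(alignment[n][col] for n in a_names)
            b = shared_properties(alignment[n][col] for n in b_names)
            flipped = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
            if flipped:
                differences.append((ga, gb, col, flipped))
    return differences


# Toy alignment: column 0 is conserved positive in subgroup A, negative in B.
alignment = {"s1": "KLD", "s2": "RIE", "s3": "DLD", "s4": "EVE"}
groups = {"A": ["s1", "s2"], "B": ["s3", "s4"]}
differences = compare_subgroups(alignment, groups)
```

In this toy case only column 0 is reported, because it is the one position where a charge conserved within each subgroup changes sign between them, which is the kind of pattern the abstract describes for the annexin domains.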

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Amino Acid Sequence
Chemical Phenomena
Chemistry, Physical
Conserved Sequence
Molecular Sequence Data
Proteins / chemistry*
Proteins / genetics*
Sequence Alignment / methods*
Sequence Alignment / statistics & numerical data
Sequence Homology, Amino Acid
Software*

Substances

Proteins