BIS2Analyzer is a tool for the online analysis of coevolving amino-acid pairs in protein sequences and for the identification of residue networks. It has been designed especially for vertebrate and viral proteins. It is based on BIS2, a reimplementation of BIS [Dib et al., 2012] that preserves the same behaviour but is significantly faster.
For many proteins characteristic of vertebrate or viral species, most coevolution methods are not applicable because of the small number of available sequences (coming either from species or from populations) and their high conservation. Statistical approaches, which require hundreds or thousands of divergent sequences and estimate the "background noise" in these sequences, cannot be applied, and alternative paradigms must be followed. BIS/BIS2 overcome these difficulties, making large-scale coevolution studies a feasible goal. The method is a fast algorithm for the coevolution analysis of relatively small sets of sequences (where "small" means < 50 sequences) displaying high similarity.
Another important aspect of BIS/BIS2 is that they allow the user to take into consideration small protein fragments, and not just residues. Indeed, small protein fragments often come into physical contact with one another, within a protein or between proteins, and play a major structural or functional role. As for single residues, a mutation in one of these fragments usually implies a compensatory mutation in another. Small protein fragments can be used as basic building blocks to reconstruct networks of coevolved amino acids in proteins.
BIS/BIS2 coevolution analysis detects networks of interacting residues/fragments and highlights a higher-order organisation of these residues/fragments, demonstrating the importance of studying this structure at a deeper level.
The coevolution analysis method BIS, and its reimplementation BIS2 used in this webserver, starts from a pool of aligned protein sequences, provides a coevolution score for each pair of positions in the sequence alignment, and clusters together those positions that display similar coevolution scores with all other positions in the alignment. The clustering step groups together the residues that coevolved during sequence evolution. BIS2 can be applied to a single protein, to a pair of proteins, or to multiple proteins at once.
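The clustering idea described above can be illustrated with a minimal sketch. It assumes a coevolution score matrix has already been computed. The matrix values, the function name cluster_by_profile and the tolerance parameter tol are all hypothetical, and the greedy grouping shown here is a simplification for illustration, not the clustering procedure implemented in BIS2.

```python
def cluster_by_profile(scores, tol=0.0):
    """Group positions whose score rows differ by at most `tol` in every entry.

    `scores` is a square matrix (list of lists); scores[i][j] is the
    coevolution score between alignment positions i and j.
    This greedy grouping is an illustrative assumption, not the BIS2 method.
    """
    clusters = []
    for pos, row in enumerate(scores):
        for cluster in clusters:
            representative = scores[cluster[0]]
            if all(abs(a - b) <= tol for a, b in zip(row, representative)):
                cluster.append(pos)
                break
        else:
            # No existing cluster has a matching profile: start a new one.
            clusters.append([pos])
    return clusters


# Hypothetical score matrix for 4 alignment positions.
toy_scores = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
toy_clusters = cluster_by_profile(toy_scores)
print(toy_clusters)
```

In this toy matrix, positions 0 and 2 share the same score profile against all other positions, as do positions 1 and 3, so each pair ends up in the same group.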
BIS/BIS2 rely on a combinatorial rather than a statistical framework, and can therefore work on very small and/or highly conserved sets of sequences. Their behaviour strongly depends on the number of exceptions allowed. For a given position i in the multiple sequence alignment, an exception is an amino acid occurring only once in the corresponding column.
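The definition of an exception given above can be made concrete with a short sketch. The toy alignment and the function name column_exceptions are hypothetical. Gap characters are treated here like any other symbol, which is an assumption rather than documented BIS2 behaviour.

```python
from collections import Counter


def column_exceptions(alignment, i):
    """Return (row index, residue) pairs for residues that occur exactly
    once in column i of the alignment (a list of equal-length strings)."""
    column = [seq[i] for seq in alignment]
    counts = Counter(column)
    return [(row, aa) for row, aa in enumerate(column) if counts[aa] == 1]


# Hypothetical toy alignment: 5 sequences, 4 columns.
toy_alignment = ["ACDE", "ACDE", "AKDF", "GCDF", "ACEF"]
exceptions_per_column = [
    column_exceptions(toy_alignment, i) for i in range(len(toy_alignment[0]))
]
for i, exceptions in enumerate(exceptions_per_column):
    print(i, exceptions)
```

In this example the first three columns each contain one exception (G, K and E respectively), while the last column contains none because both E and F occur more than once.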
Given an alignment of protein sequences and an integer D representing the maximal number of "errors/exceptions" admitted for capturing the coevolution signals, the BIS method iteratively computes the coevolution of either blocks or positions, for all dimensions d such that d ≤ D.
Coevolution analysis highlights the combined functional or structural role of groups of residues, possibly organized in fragments, in a protein and can suggest complex combinations (pairs, triplets or tuples) of amino acid mutations to be tested by experiments studying protein structural stability or functional activity.
The definition of a precise mathematical framework to analyze coevolution signals allowed us to demonstrate that fragments, and not just isolated residues, are the coevolving units for pools of conserved sequences. In vitro experiments revealed that these fragments play important functional and structural roles for protein families.
BIS/BIS2 have been tested on viral as well as human proteins. Conservation and coevolution patterns can be detected simultaneously [Dib et al., 2012]. The detection of coevolution signals in viral proteins benefited from splitting the analysis by genotype [Champeimont et al., 2016], which gathers complementary information.