This document was taken from a search engine cache. Address of the original document: http://mouse.belozersky.msu.ru/tools/svetka/articles/fours.html
Modified: Fri Jun 9 07:23:11 2006
Indexed: Tue Feb 5 10:46:11 2013
Reference information || Rule of four sequences
Some additional reference information
Rule of fours

The sequences in an alignment could, in principle, be random. That is unlikely in practice, but even in this worst case the distances between them should satisfy several rules. One simple yet important rule is the following.

[Figure: tree of four taxa. A and B join at node E, C and D join at node F, and E is connected to F.]

Imagine that four sequences (called "taxa" in phylogenetics) are joined into a tree. The internal nodes E and F are, generally speaking, unknown, but according to phylogenetic theory they must exist, at least conceptually. It is assumed that, in the course of molecular evolution, taxa A and B evolved from taxon E; likewise, taxa C and D evolved from taxon F. Since the distance between sequences is a measure of evolutionary change and is, moreover, an additive quantity, we can write:
|AC|=|AE|+|EF|+|FC|,
|AD|=|AE|+|EF|+|FD|;
|BC|=|BE|+|EF|+|FC|,
|BD|=|BE|+|EF|+|FD|.

From this it follows that
|AC|+|BD|=|AE|+|EF|+|FC|+|BE|+|EF|+|FD|=|AE|+|EF|+|FD|+|BE|+|EF|+|FC|=|AD|+|BC|.

We can also note that, provided |EF| > 0,
|AB|=|AE|+|BE|, |CD|=|FC|+|FD|,
|AB|+|CD|=|AE|+|BE|+|FC|+|FD|=|AC|+|BD|-2|EF|<|AC|+|BD|.


This is the rule of four sequences.
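The derivation can be checked numerically. The following is a minimal sketch, not part of the original program: the function name `tree_distances` and the branch lengths are made up for illustration, and the distances are assumed to be exactly additive along the tree.

```python
def tree_distances(ae, be, fc, fd, ef):
    """Pairwise distances for the tree with A, B attached to E and C, D to F.

    Each distance is the sum of the branch lengths on the path between
    the two taxa (the additivity assumption from the text).
    """
    return {
        "AB": ae + be,
        "CD": fc + fd,
        "AC": ae + ef + fc,
        "AD": ae + ef + fd,
        "BC": be + ef + fc,
        "BD": be + ef + fd,
    }

# Hypothetical branch lengths, chosen as integers to avoid rounding issues.
ef = 5
d = tree_distances(ae=3, be=2, fc=4, fd=1, ef=ef)

sum_ab_cd = d["AB"] + d["CD"]
sum_ac_bd = d["AC"] + d["BD"]
sum_ad_bc = d["AD"] + d["BC"]

# The two "crossing" sums coincide; the third is smaller by 2*|EF|.
print(sum_ab_cd, sum_ac_bd, sum_ad_bc)
```

With real alignment distances the equality holds only approximately, which is what the quality measure described below is meant to capture.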
In practice, this rule is almost never satisfied exactly, first of all because of random fluctuations (e.g., one amino acid in a sequence being replaced by another). But we can say that the more closely an alignment satisfies the rule of fours, the better it is. For this reason, a measure of alignment quality was introduced.
If we take four arbitrary sequences, we do not know in advance which of them correspond to A, B, C and D in the scheme above (the main difference between the four taxa is that A and B descend from one node, whereas the pairs A - C and A - D are separated by two nodes and three "bridges"). But, according to the rule of fours, we can form three sums from the six distances |AB|, |AC|, |AD|, |BC|, |BD|, |CD|, namely |AB|+|CD|, |AC|+|BD| and |AD|+|BC|. The two largest of these sums must be equal, and the third must be smaller (it differs from each of the other two by 2*|EF| in the scheme above). The more exactly this rule is satisfied, the better the alignment is considered to be.
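Because the labelling of the four sequences is unknown, one way to apply the rule is to compute all three pair sums and sort them. This is a sketch under stated assumptions: the helper `pair_sums`, the dictionary layout keyed by `frozenset`, and the example distances are hypothetical and not taken from the original program.

```python
def pair_sums(dist, a, b, c, d):
    """Return the three pair sums for four sequences, sorted ascending.

    `dist` maps a frozenset of two labels to their distance
    (an assumed layout, chosen so that the order of labels does not matter).
    """
    def get(x, y):
        return dist[frozenset((x, y))]

    return sorted([
        get(a, b) + get(c, d),
        get(a, c) + get(b, d),
        get(a, d) + get(b, c),
    ])

# Made-up distances that are exactly additive on a tree grouping A with B
# and C with D.
dist = {
    frozenset("AB"): 5, frozenset("CD"): 5,
    frozenset("AC"): 12, frozenset("AD"): 9,
    frozenset("BC"): 11, frozenset("BD"): 8,
}

# The labels are passed in an arbitrary order on purpose.
small, mid, large = pair_sums(dist, "C", "A", "D", "B")
print(small, mid, large)
```

For an ideal tree `large - mid` is zero and `mid - small` equals 2*|EF|; how far real data deviate from this is the quantity of interest.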
To describe how these three pair sums are distributed, we introduced a special quality parameter. This parameter is linearly related to the differences between the sums and takes values in the interval [-1; 1]. The larger the value of this parameter, the more exactly the rule of fours is satisfied and, hence, the better the alignment under consideration.
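The text does not give the exact formula for this parameter. Purely as an illustration, here is one hypothetical score with the stated properties (linear in the differences between the sorted sums, values in [-1; 1]). The name `fours_quality`, the formula itself, and the value returned in the degenerate case are all assumptions and may differ from what the program actually computes.

```python
def fours_quality(s_small, s_mid, s_large):
    """Hypothetical quality score for three pair sums given in ascending order.

    Returns  1.0 when the two largest sums are equal (rule satisfied exactly),
            -1.0 when the two smallest sums are equal (worst case),
    and values in between otherwise. This formula is an assumption,
    not the one used by the original program.
    """
    spread = s_large - s_small
    if spread == 0:
        # All three sums equal (star-like tree): the score is undefined here;
        # returning 0.0 is an arbitrary choice for this sketch.
        return 0.0
    return ((s_mid - s_small) - (s_large - s_mid)) / spread

print(fours_quality(10, 20, 20))
print(fours_quality(10, 15, 20))
print(fours_quality(10, 10, 20))
```

A score for a whole alignment could then be obtained, for example, by averaging over many quadruples of sequences, though the source does not say how the per-quadruple values are combined.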

If you find mistakes or interesting facts, please let us know!