Original document: http://storage.bioinf.fbb.msu.ru/~roman/mol_biol_rus_2006.pdf
MOLEKULYARNAYA BIOLOGIYA (MOLECULAR BIOLOGY), 2006, vol. 40, no. 3, pp. 541-545

MATHEMATICAL AND SYSTEMS BIOLOGY
UDC 577.1

A PROBABILISTIC METHOD FOR PREDICTING TRANSMEMBRANE SEGMENTS FROM A MULTIPLE ALIGNMENT OF AMINO ACID SEQUENCES
© 2006 R. A. Sutormin1*, A. A. Mironov1, 2, 3

1State Scientific Center GosNIIGenetika, Moscow, 117545
2Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127994
3Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, 119992
Received January 25, 2006

MEMBRANE PROBABILITY PROFILE CONSTRUCTION BASED ON AMINO ACIDS SEQUENCES MULTIPLE ALIGNMENT, by R. A. Sutormin1*, A. A. Mironov1, 2, 3 (1State Scientific Center GosNIIGenetika, Moscow, 113545 Russia, *e-mail: sutor_ra@mail.ru; 2Institute for Information Transmission Problems, Moscow, 127994 Russia; 3Department of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, 119992 Russia).

Predicting the positions of membrane segments in the sequences of membrane proteins is a long-standing and important problem. The accuracy achieved by methods that do not use a homology search in an additional data bank can be improved. Test data in this area are scarce because few real structures of membrane proteins are available. In this work we build a test set of structural alignments of membrane proteins in which the membrane annotation reconciles the known 3D structures of the proteins in each alignment. We propose a method for predicting the membrane annotation of a multiple alignment that uses the forward-backward algorithm from the theory of hidden Markov models (HMMs). The method not only predicts the positions of membrane segments but also produces a membrane probability profile, which can then be used by multiple alignment methods that take secondary structure information into account. The method is implemented in a program available at http://bioinf.fbb.msu.ru/fwdbck/. It gives better results than MEMSAT, which is nearly the only method that predicts the membrane annotation of a multiple alignment without an additional homology search.

Key words: membrane protein, secondary structure prediction, hidden Markov models, forward-backward algorithm, membrane probability profile, test data set.

*E-mail: sutor_ra@mail.ru


Many problems in bioinformatics include a stage in which amino acid sequences of proteins are aligned [1], so the quality of the alignments is often a critical factor in sequence analysis [2]. For membrane proteins, determining three-dimensional structures runs into many difficulties [3], and little crystallographic data is available. As a result, it is hard to build structural alignments against which automatic multiple alignment methods can be checked [4]. On the other hand, there is reason to believe that the generally accepted alignment methods do not guarantee good quality, because the amino acid composition of membrane proteins is unusual and uneven [5]. It has been suggested repeatedly that bringing information on the secondary structure of proteins into the alignment process should improve alignment quality [6]. For membrane proteins it is more logical to treat the regions of the protein sequence located in the membrane as the secondary structure [7], because the regions outside the membrane resemble globular proteins in their properties, and globular proteins in turn align rather well [8]. Thus, before developing an alignment algorithm that takes the membrane annotation of sequences into account, one has to learn to build this annotation correctly. Unfortunately, the quality of methods that predict the positions of membrane segments from the amino acid sequence is far from ideal [9]. A number of methods, such as PHDpsihtm (part of the PredictProtein server) [10] or MEMSAT [11] (in online mode), build predictions for a sequence from the results of a search for homologous sequences in some protein data bank. In this work we focused on solving the problem without an additional homology search. Apart from MEMSAT (in offline mode), we could not find any method that predicts the positions of membrane segments from a multiple alignment and requires no additional homologous data.
MEMSAT chooses a suitable transmembrane model by dynamic programming, according to whether amino acid residues prefer to be located at the edges of the membrane, in its middle, or outside it. As its result the method returns a mask sequence, each symbol of which says where the amino acid residues of the corresponding alignment column are located: in the membrane or outside. Since one cannot be certain that the position of a membrane segment is rigidly fixed (the protein molecule "breathes", i.e., the links of the chain oscillate slightly), it would be more adequate to predict the probability that a given amino acid is in the membrane (we call this the "membrane probability profile"). For amino acids lying inside the membrane this probability should be high, and at the edges it should fall smoothly to zero. Such probabilities can be obtained by building a hidden Markov model (HMM) of the membrane and outer regions. Using the forward-backward algorithm [12], we can compute the probability with which a given amino acid fits into the membrane part of the model.

The aim of this work was to build a set of clusters of membrane proteins with known three-dimensional structure for which an adequate structural alignment can be constructed, and to develop a method for constructing a membrane profile for the columns of a multiple alignment.

METHODS

The method for forming the membrane profile, in which each column of a multiple alignment is assigned the probability that the amino acid residues of this column lie in the membrane, is as follows. A frequency amino acid profile (frequency matrix) is built from the multiple alignment. To do this, a matrix of pairwise evolutionary distances between the sequences is built from the pairwise identity values by the Poisson correction method [13]:

$$d = -\log\left(\frac{20\,\max\{1.1/20,\ id\} - 1}{19}\right),$$

where d is the evolutionary distance and id is the fraction of alignment columns with matching amino acids. Next, a phylogenetic tree is built by the neighbor-joining method [14], and each sequence is assigned a weight by the simple but effective method proposed by Gerstein et al. [15]. The weight has the following property: if there are k identical sequences, each of them gets the weight 1/k, whereas a sequence that resembles no other gets the weight 1. The frequency profile is formed by averaging the single-sequence profiles of all sequences with their weights. The result is then formed with the forward-backward algorithm on a hidden Markov model (HMM) analogous to the one used in the TMHMM server [16].
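The step from a frequency profile to a membrane probability profile can be illustrated with a minimal sketch. It is not the FWDBCK implementation: it assumes a toy two-state model ("out" and "mem") instead of the TMHMM-like architecture, a reduced two-letter alphabet (H for hydrophobic, P for polar), made-up parameters, and it scores a profile column by weighting each state's emission probabilities with the column frequencies. All names in it (`column_likelihood`, `membrane_posteriors`, `START`, `TRANS`, `EMIT`) are hypothetical.

```python
def column_likelihood(column, emit_state):
    """Likelihood of one profile column under one state.

    Assumption: the residue frequencies of the column weight the
    emission probabilities of the state (an expected emission).
    """
    return sum(freq * emit_state[res] for res, freq in column.items())


def membrane_posteriors(columns, start, trans, emit):
    """Forward-backward posteriors P(state | whole profile) for each column."""
    states = list(start)
    n = len(columns)
    if n == 0:
        return []
    like = [{s: column_likelihood(col, emit[s]) for s in states} for col in columns]

    # Forward pass; each column is rescaled to sum to 1 to avoid underflow.
    fwd, scale = [], []
    for t in range(n):
        if t == 0:
            cur = {s: start[s] * like[0][s] for s in states}
        else:
            cur = {s: like[t][s] * sum(fwd[t - 1][r] * trans[r][s] for r in states)
                   for s in states}
        total = sum(cur.values())
        fwd.append({s: v / total for s, v in cur.items()})
        scale.append(total)

    # Backward pass, reusing the scaling factors of the forward pass.
    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in states}
    for t in range(n - 2, -1, -1):
        bwd[t] = {s: sum(trans[s][r] * like[t + 1][r] * bwd[t + 1][r] for r in states)
                     / scale[t + 1]
                  for s in states}

    # The posterior of a column is the normalized product of both passes.
    post = []
    for t in range(n):
        raw = {s: fwd[t][s] * bwd[t][s] for s in states}
        total = sum(raw.values())
        post.append({s: v / total for s, v in raw.items()})
    return post


# Toy parameters (illustrative only, not trained values).
START = {"out": 0.9, "mem": 0.1}
TRANS = {"out": {"out": 0.9, "mem": 0.1}, "mem": {"out": 0.1, "mem": 0.9}}
EMIT = {"out": {"H": 0.3, "P": 0.7}, "mem": {"H": 0.9, "P": 0.1}}

polar = {"H": 0.0, "P": 1.0}
hydrophobic = {"H": 1.0, "P": 0.0}
profile = [polar] * 5 + [hydrophobic] * 8 + [polar] * 5
membrane_profile = [p["mem"] for p in membrane_posteriors(profile, START, TRANS, EMIT)]
```

Here `membrane_profile` holds one probability per column: high inside the hydrophobic stretch and low in the polar flanks. Unlike a Viterbi path, these values are marginal probabilities, which is why they can serve as a profile.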
The TMHMM-like model distinguishes states for amino acids located in the cytoplasm, states for amino acids facing the outside of the cell, and two sequences of states that correspond to the protein chain crossing the membrane from the inside out and vice versa. Two groups of model states fall on the membrane boundaries. The model parameters were trained on the set of annotated single sequences available on the TMHMM server site. That server predicts the positions of membrane segments in a single sequence and cannot work with an amino acid frequency profile or with a multiple alignment. It is based on the Viterbi algorithm (see [12]), which finds the optimal path but, unlike the forward-backward algorithm, cannot be used to build a probability profile.

Construction of the test set

To test the method, a set of reference multiple alignments was built. All sequences of membrane proteins with known spatial structure (442 proteins) were taken from the PDBTM server site [17]. All pairwise alignments were then built with the CLUSTALW program [1]. Whenever a pair of proteins had an identity of at least 95%, only one of the two was kept. The proteins were then clustered by pairwise identity with the neighbor-joining method [14] and a lower threshold of 20%. If a cluster contained more than 20 proteins, its lower threshold was raised until it split into smaller clusters. After that, only clusters with at least 3 proteins were considered. For each cluster, a multiple structural alignment of the three-dimensional protein structures was built with the MAMMOTH server [18]. If the alignment quality was very low (few alignment columns reliable from the point of view of the method), the most distant member of the cluster was discarded and the cluster was aligned again. This procedure yielded 11 clusters comprising 55 proteins. The fraction of structurally reliable alignment columns ranges from 24 to 86%, with a mean of 63%. Cluster size ranges from 3 to 8 proteins, with a mean of 5 proteins. The membership of the cluster proteins in structural families was then checked against the SCOP [19] and CATH [20] classifications. One cluster contains a two-domain structure, and some of its proteins have only one of the two domains. Three clusters contain proteins whose structural families are not designated in either classification.
Two clusters contain proteins from different families: in one, the bacteriorhodopsin family (f.13.1.1 in SCOP) is mixed with the succinate dehydrogenase/fumarate reductase family (f.21.2.2 in SCOP); in the other, the family of photosystem I reaction center proteins (f.31.1.1 in SCOP) is mixed with the ATPase domain of ABC transporters (c.37.1.12 in SCOP).
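The filtering and clustering steps of the test-set construction can be sketched as follows. This is only an illustration under stated assumptions: it swaps in plain single-linkage clustering over an identity threshold, whereas the text cites the neighbor-joining method [14]; it raises the threshold in fixed steps for as long as a cluster stays above the size limit; and the function names and the `step` parameter are hypothetical. The input is a symmetric matrix of pairwise identities given as fractions.

```python
def remove_redundant(identity, cutoff=0.95):
    """Keep one protein from every group with identity >= cutoff."""
    keep = []
    for i in range(len(identity)):
        if all(identity[i][j] < cutoff for j in keep):
            keep.append(i)
    return keep


def single_linkage(identity, threshold):
    """Link two proteins when their identity reaches the threshold (assumption)."""
    n = len(identity)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if identity[i][j] >= threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def build_clusters(identity, low=0.20, max_size=20, min_size=3, step=0.05):
    """Clusters of non-redundant proteins, returned as sorted index lists."""
    keep = remove_redundant(identity)
    sub = [[identity[i][j] for j in keep] for i in keep]
    pending = [(list(range(len(keep))), low)]
    result = []
    while pending:
        members, thr = pending.pop()
        local = [[sub[i][j] for j in members] for i in members]
        for group in single_linkage(local, thr):
            group = [members[g] for g in group]
            if len(group) > max_size and thr + step <= 1.0:
                # Too large: cluster this group again with a higher threshold.
                pending.append((group, thr + step))
            elif len(group) >= min_size:
                result.append(sorted(keep[g] for g in group))
    return result
```

The defaults mirror the numbers in the text (95% redundancy cutoff, 20% lower threshold, at most 20 and at least 3 proteins per cluster); the later structural-alignment check, which drops the most distant member of a poorly aligned cluster, is not modeled here.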

Figure. Histogram of the distribution of membrane core lengths (horizontal axis: length of the membrane core, 2 to 30; vertical axis: count, 0 to 20).

Construction of a reliable membrane annotation

In every protein of every cluster, the segments of the protein sequence lying in the membrane were annotated with the TMDET algorithm [21], which determines the most probable position of the membrane in the three-dimensional structure. To avoid misclassifying a segment of the protein chain as membrane because TMDET predicted the membrane position inaccurately, "gray" zones 5 angstroms thick were introduced at the edges of the membrane. A chain segment that lies only in a "gray" zone is not considered a membrane segment. The annotations were mapped onto the structural alignments, and a common membrane annotation (membrane cores) was formed on this basis. A core consists of those columns of the structural alignment in which all non-gap positions are marked as membrane. Guided by the output of the MAMMOTH server on the reliability of the structural alignment in individual columns, the cores were divided into two classes, trustworthy and untrustworthy. The first class contains cores that are at least five columns long and in which two thirds of the columns have an alignment that is reliable according to MAMMOTH. Cores of the second class were excluded from consideration. In total the procedure yielded 56 membrane cores; an alignment contains 5 cores on average, and the number of cores per alignment varies from 1 to 12. In addition, five doubtful cores, in which fewer than 60% of the columns are reliable from the point of view of the structural alignment, were excluded in three alignments. The distribution of core lengths is shown in the figure.

Methods for predicting the membrane annotation of an alignment

The following methods for predicting the membrane annotation were tested: MEMSAT; FWDBCK, which is based on the method for forming the transmembrane probability profile described above; and averaging of the results of the HMMTOP server [22] over the proteins in the alignment (hereafter, HMMTOP averaging). MEMSAT was given the amino acid frequency profiles of the alignments as input, computed with sequence weights but without taking gaps into account. The FWDBCK annotation was formed by declaring as membrane the columns whose probability of being in the membrane was at least 0.8. Runs of fewer than five consecutive membrane columns were not considered membrane. HMMTOP averaging works as follows. Each sequence in the alignment receives the membrane annotation predicted by the HMMTOP server. Columns in which at least two thirds of the non-gap positions are marked as membrane are declared membrane. Runs of fewer than five consecutive membrane columns were not considered membrane.

Assessing the quality of HMMTOP

To make sure that methods that predict the membrane annotation from an alignment work better than methods that deal with a single sequence only, the quality of HMMTOP was checked for every protein sequence of every cluster. For each sequence, the information on column reliability in the structural alignment of the corresponding cluster was "restricted" by discarding the columns in which the sequence has a gap. The core annotation of the sequence was built in the same way, as the restriction of the core annotation of the whole alignment. A filter analogous to the one described above was then applied to the core annotation of the sequence and to the annotation predicted by HMMTOP; the filter ignores membrane segments and cores that are short or overlap little with the "reliability mask". The result is given in the table under "HMMTOP (orig.)".

Prediction quality assessment

Before the quality of an annotation predicted by some method is assessed, the membrane segments that cannot be trusted are removed from it. A segment was considered trustworthy if two thirds of the columns it covers have a structural alignment that is reliable according to MAMMOTH and if the segment is at least five columns long. For each prediction method and each cluster, two quality values were computed. The first, quality_five, is the number of cores covered by any predicted membrane segment over at least five columns, divided by the maximum of the number of cores and the number of predicted segments. The second, quality_half, is the number of cores in which at least 50% of the columns are covered by any predicted segment, divided by the maximum of the number of cores and the number of predicted segments. As the table shows, the FWDBCK method gives the best results.

Table. Quality of membrane annotation prediction by different methods

Method            MEMSAT   FWDBCK   HMMTOP (avg.)   HMMTOP (orig.)
quality_five (a)  0.964    0.977    0.934           0.916
quality_half (b)  0.964    0.966    0.934           0.914

(a) Prediction quality defined as the number of cores covered by any predicted membrane segment over at least five columns, divided by the maximum of the number of cores and the number of predicted segments.
(b) Prediction quality defined as the number of cores in which at least 50% of the columns are covered by any predicted membrane segment, divided by the maximum of the number of cores and the number of predicted segments.

RESULTS

At the present stage of bioinformatics there is a lack of data on membrane proteins against which one could check the quality of methods that automatically predict the positions of membrane segments and of methods that build multiple alignments of amino acid sequences. In the section of the BAliBASE database [4] devoted to membrane proteins, most alignments carry no membrane annotation that could have been derived from an analysis of known three-dimensional structures, and the columns whose alignment can be trusted from the point of view of a structural alignment method are not marked. In this work we built a set of clusters of membrane proteins in which every cluster has a structural multiple alignment annotated with membrane cores, i.e., groups of columns whose membrane location is confirmed by the structure of every protein in the cluster. The cores have a mean length of 15.5, somewhat less than 21 (the generally accepted mean length of a membrane segment of a protein chain), but they contain no doubtful columns. The columns that are reliable from the point of view of the structural alignment method are marked as well. Thus, despite its small size, this set can be used with confidence to check the quality of methods that predict the membrane annotation or build multiple alignments.

In addition, a method for forming the membrane probability profile was developed. Its adequacy was checked by predicting the membrane annotation from the profile (see FWDBCK in the table). The quality of this prediction turned out to be somewhat better than that of the most accurate methods that do not resort to a homology search in an additional data bank. Such a profile can also be used when building multiple alignments of membrane protein sequences. If the alignment method is "progressive", then at every step that merges the profiles of two subalignments into one, the resulting alignment can be improved by varying, for each column, parameters such as the substitution matrix and the gap opening and gap extension penalties according to the probability that the amino acids of that column lie in the membrane. Finally, an Internet server was developed where users can obtain the membrane probability profile for their own alignment. The server and the test set are available at http://bioinf.fbb.msu.ru/fwdbck/.

This work was supported by the Russian Academy of Sciences (programs "Molecular and Cellular Biology" and "Origin and Evolution of the Biosphere"), the Howard Hughes Medical Institute (grant 55000309), and the Russian Foundation for Basic Research (0504-48759).

REFERENCES
1. Thompson J.D., Higgins D.G., Gibson T.J. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680.
2. Jaroszewski L., Li W., Godzik A. 2002. In search for more accurate alignments in the twilight zone. Protein Sci. 11, 1702-1713.
3. Zhang H., Cramer W.A. 2005. Problems in Obtaining Diffraction-quality Crystals of Heterooligomeric Integral Membrane Proteins. J. Struct. Funct. Genomics. 6, 219-223.
4. Bahr A., Thompson J.D., Thierry J.C., Poch O. 2001. BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations. Nucleic Acids Res. 29, 323-326.
5. Sutormin R.A., Rakhmaninova A.B., Gelfand M.S. 2003. BATMAS30: amino acid substitution matrix for alignment of bacterial transporters. Proteins. 51, 85-95.
6. Heringa J. 1999. Two strategies for sequence comparison: profile-preprocessed and secondary structure-induced multiple alignment. Comput. Chem. 23, 341-364.
7. Ng P.C., Henikoff J.G., Henikoff S. 2000. PHAT: a transmembrane-specific substitution matrix. Predicted hydrophobic and transmembrane. Bioinformatics. 16, 760-766.
8. Do C.B., Mahabhashyam M.S., Brudno M., Batzoglou S. 2005. ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res. 15, 330-340.
9. Chen C.P., Kernytsky A., Rost B. 2002. Transmembrane helix predictions revisited. Protein Sci. 11, 2774-2791.
10. Rost B., Liu J. 2003. The PredictProtein server. Nucleic Acids Res. 31, 3300-3304.
11. Jones D.T. 1998. Do transmembrane protein superfolds exist? FEBS Lett. 423, 281-285.
12. Krogh A., Mian I.S., Haussler D. 1994. A hidden Markov model that finds genes in E. coli DNA. Nucleic Acids Res. 22, 4768-4778.
13. Zuckerkandl E., Pauling L. 1965. Molecules as documents of evolutionary history. J. Theor. Biol. 8, 357-366.
14. Saitou N., Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406-425.
15. Gerstein M., Sonnhammer E.L., Chothia C. 1994. Volume changes in protein evolution. J. Mol. Biol. 236, 1067-1078.
16. Sonnhammer E.L., von Heijne G., Krogh A. 1998. A hidden Markov model for predicting transmembrane helices in protein sequences. Proc. Int. Conf. Intell. Syst. Mol. Biol. 6, 175-182.
17. Tusnady G.E., Dosztanyi Z., Simon I. 2005. PDB_TM: selection and membrane localization of transmembrane proteins in the protein data bank. Nucleic Acids Res. 33, 275-278.
18. Lupyan D., Leo-Macias A., Ortiz A.R. 2005. A new progressive-iterative algorithm for multiple structure alignment. Bioinformatics. 21, 3255-3263.
19. Murzin A.G., Brenner S.E., Hubbard T., Chothia C. 1995. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536-540.
20. Orengo C.A., Michie A.D., Jones S., Jones D.T., Swindells M.B., Thornton J.M. 1997. CATH - a hierarchic classification of protein domain structures. Structure. 5, 1093-1108.
21. Tusnady G.E., Dosztanyi Z., Simon I. 2005. TMDET: web server for detecting transmembrane regions of proteins by using their 3D coordinates. Bioinformatics. 21, 1276-1277.
22. Tusnady G.E., Simon I. 1998. Principles governing amino acid composition of integral membrane proteins: application to topology prediction. J. Mol. Biol. 283, 489-506.
