Document taken from a search engine cache. Original document address: http://dualopt1.cmm.msu.ru/pub/Education/NanoMod/modern_approaches_3.pdf
Date modified: Thu May 6 18:41:49 2010
Date indexed: Fri Feb 28 20:28:00 2014
· · · , · · ­ · PFAM ·



His57-Asp102-Ser195

CTRB_HUMAN  WGRRITDVMICAG--ASGVSSCMGDSGGPLVCQKD-GAWTLVGIVSWGS
CTR2_CANFA  WGSKITDLMVCAG--ASGVSSCMGDSGGPLVCQKD-GAWTLVGIVSWGS
CTRB_BOVIN  WGSRVTDVMICAG--ASGVSSCMGDSGGPLVCQKN-GAWTLAGIVSWGS
CTRB_RAT    WGSKITDVMTCAG--ASGVSSCMGDSGGPLVCQKD-GVWTLAGIVSWGS
CTRA_BOVIN  WGTKIKDAMICAG--ASGVSSCMGDSGGPLVCKKN-GAWTLVGIVSWGS
CTRA_GADMO  WGNKISDLMICAG--AAGASSCMGDSGGPLVCQKA-GSWTLVGIVSWGS
CTRL_HUMAN  WGSSITDSMICAG--GAGASSCQGDSGGPLVCQKG-NTWVLIGIVSWGT
TRY2_BOVIN  YPGQITNNMICAGFLEGGKDSCQGDSGGPVACNG-----QLQGIVSWGY
TRY2_CANFA  YPGQITENMICAGFLEGGKDSCQGDSGGPVVCNG-----ELQGIVSWGY
TRY1_HUMAN  YPGKITSNMFCVGFLEGGKDSCQGDSGGPVVCNG-----QLQGVVSWGD
TRY1_RAT    YPGEITSSMICVGFLEGGKDSCQGDSGGPVVCNG-----QLQGIVSWGY
TRY2_RAT    YPGKITDNMVCVGFLEGGKDSCQGDSGGPVVCNG-----ELQGIVSWGY
TRY1_CANFA  YPGQISSNMMCLGYMEGGKDSCQGDSGGPVVCNG-----ELQGVVSWGA
TRY1_BOVIN  YPGQITSNMFCAGYLEGGKDSCQGDSGGPVVCSG-----KLQGIVSWGS
TRYP_PIG    YPGQITGNMICVGFLEGGKDSCQGDSGGPVVCNG-----QLQGIVSWGY
TRY2_XENLA  YPGEITKNMFCAGFLAGGKDSCQGDSGGPVVCNG-----QLQGVVSWGY
TRY1_XENLA  YPGEITANMICVGYMEGGKDSCQGDSGGPVVCNG-----QLQGVVSWGY
TRY1_CHICK  YPGRITSNMICIGYLNGGKDSCQGDSGGPVVCNG-----QLQGIVSWGI
TRY2_CHICK  YPGRITSNMICIGYLNGGKDSCQGDSGGPVVCNG-----QLQGFVSWGI
EL3B_HUMAN  WGSSVKKTMVCAG-GD-IRSGCNGDSGGPLNCPTEDGGWQVHGVTSFVSA
EL3A_HUMAN  WGSTVKKTMVCAG-GY-IRSGCNGDSGGPLNCPTEDGGWQVHGVTSFVSG
CAC3_BOVIN  WGITVKKTMVCAG-GD-TRSGCNGDSGGPLNCPAADGSWQVHGVTSFVSA
EL2_MOUSE   WGSSVKSSMVCAG-GDGVTSSCNGDSGGPLNCRASNGQWQVHGIVSFGSS
EL2_RAT     WGSSVKTNMVCAG-GDGVTSSCNGDSGGPLNCQASNGQWQVHGIVSFGST
EL2_PIG     WGSTVKTNMICAG-GDGIISSCNGDSGGPLNCQGANGQWQVHGIVSFGSS
EL2A_HUMAN  WGSSVKTSMICAG-GDGVISSCNGDSGGPLNCQASDGRWQVHGIVSFGSR
CLCR_HUMAN  WGFRVKKTMVCAG-GDGVISACNGDSGGPLNCQLENGSWEVFGIVSFGSR




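p_ij is the frequency of amino acid i at alignment position j. The Shannon entropy of position j is:

$$H_j = -\sum_i p_{ij} \log_2 p_{ij}$$

The maximum value, reached when all 20 amino acids are equally frequent at a position, is:

$$H_m = \log_2 20 \approx 4.3$$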
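As a minimal sketch of the entropy formula, the Python below computes H_j for every column of a small alignment. The function names and the decision to skip gap characters are assumptions made for illustration; the three example rows are short fragments around the GDSGGP motif copied from the serine protease alignment (CTRB_HUMAN, TRY2_BOVIN, EL3B_HUMAN).

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Gap characters ('-') are skipped; this is an assumption of the sketch.
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    n = len(residues)
    # Sum of -p * log2(p) over the amino acids observed in this column.
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def alignment_entropy(rows):
    """Entropy of every column; rows must all have the same length."""
    return [column_entropy(col) for col in zip(*rows)]


# Fragments around the GDSGGP motif (CTRB_HUMAN, TRY2_BOVIN, EL3B_HUMAN).
rows = ["CMGDSGGPLV", "CQGDSGGPVA", "CNGDSGGPLN"]
for position, h in enumerate(alignment_entropy(rows), start=1):
    print(f"{position:2d}  {h:.2f}")
```

Fully conserved columns such as those of GDSGGP give zero entropy, while a column in which all 20 amino acids are equally frequent gives the maximum log2(20).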


Subtilisin (1SBC, 344 sequences)
[Plot: Shannon's entropy (y-axis, 0 to 3) versus residue number AA# (x-axis, 0 to 240)]

D32, H64, S221: active site; 65, 83, 154, 202, 219: Gly; S125, N155, T220


Trypsin (500 proteins)
[Plot: Shannon's entropy (y-axis, 0 to 3) versus residue number. Labelled positions: 1; 4 G; 25, 38, 40, 41 (C, AHC); 51; 84 D; 101; 120-122 GYG; 148 C; 151; 162 C; 175-180 GDSGGP; 189 G; 201; 215 W]


LOG
ID   ADH_IRON_1; BLOCK
AC   BL00913C; distance from previous block=(56,76)
DE   Iron-containing alcohol dehydrogenases proteins.
BL   HHG motif; width=22; seqs=11; 99.5%=492; strength=1428
ADHE_CLOAB ( 720) CHSMAIKLSSEHNIPSGIANAL  66
FUCO_ECOLI ( 262) VHGMAHPLGAFYNTPHGVANAI  44
GLDA_BACST ( 259) HNGFTALEGEIHHLTHGEKVAF 100
GLDA_ECOLI ( 269) VHNGLTAIPDAHHYYHGEKVAF 100
MEDH_BACMT ( 259) VHSISHQVGGVYKLQHGICNSV  78
ADH1_CLOAB ( 258) CHSMAHKTGAVFHIPHGCANAI  47
ADHE_ECOLI ( 721) CHSMAHKLGSQFHIPHGLANAL  47
ADH2_ZYMMO ( 261) VHAMAHQLGGYYNLPHGVCNAV  36
ADH4_YEAST ( 263) VHALAHQLGGFYHLPHGVCNAV  41
ADHA_CLOAB ( 266) CHPMEHELSAYYDITHGVGLAI  50
ADHB_CLOAB ( 266) VHLMEHELSAYYDITHGVGLAI  49
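The sequence rows of such a block can be read mechanically. The sketch below parses the eleven rows of the ADH_IRON_1 block and builds weighted per-column amino acid frequencies. It assumes that the number in parentheses is the start position of the segment and that the trailing number is a sequence weight; the function names and the regular expression are hypothetical and written only for this example.

```python
import re
from collections import Counter

BLOCK_ROWS = """\
ADHE_CLOAB ( 720) CHSMAIKLSSEHNIPSGIANAL 66
FUCO_ECOLI ( 262) VHGMAHPLGAFYNTPHGVANAI 44
GLDA_BACST ( 259) HNGFTALEGEIHHLTHGEKVAF 100
GLDA_ECOLI ( 269) VHNGLTAIPDAHHYYHGEKVAF 100
MEDH_BACMT ( 259) VHSISHQVGGVYKLQHGICNSV 78
ADH1_CLOAB ( 258) CHSMAHKTGAVFHIPHGCANAI 47
ADHE_ECOLI ( 721) CHSMAHKLGSQFHIPHGLANAL 47
ADH2_ZYMMO ( 261) VHAMAHQLGGYYNLPHGVCNAV 36
ADH4_YEAST ( 263) VHALAHQLGGFYHLPHGVCNAV 41
ADHA_CLOAB ( 266) CHPMEHELSAYYDITHGVGLAI 50
ADHB_CLOAB ( 266) VHLMEHELSAYYDITHGVGLAI 49
"""

# name, (start), segment, weight -- the field meanings are assumptions.
ROW = re.compile(r"(\S+)\s+\(\s*(\d+)\)\s+([A-Z]+)\s+(\d+)")


def parse_rows(text):
    """Return a list of (name, start, segment, weight) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            name, start, segment, weight = match.groups()
            rows.append((name, int(start), segment, int(weight)))
    return rows


def weighted_column_frequencies(rows):
    """Per-column amino acid frequencies, each sequence counted by its weight."""
    width = len(rows[0][2])
    total = sum(row[3] for row in rows)
    columns = []
    for j in range(width):
        counts = Counter()
        for _, _, segment, weight in rows:
            counts[segment[j]] += weight
        columns.append({aa: c / total for aa, c in counts.items()})
    return columns


rows = parse_rows(BLOCK_ROWS)
freqs = weighted_column_frequencies(rows)
# Columns occupied by a single amino acid in all sequences.
conserved = [j + 1 for j, col in enumerate(freqs) if len(col) == 1]
print("fully conserved columns:", conserved)
```

Weighting down-weights closely related sequences (for example ADHA_CLOAB and ADHB_CLOAB), so they do not dominate the column frequencies.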

Shmuel Pietrokovski pietro@weizmann.ac.il
(http://bioinformatics.weizmann.ac.il/~pietro/Making_and_using_protein_MA/)


PSI-BLAST (Position Specific Iterated BLAST):
· (, ), .. ( ) , score , . ( ) score


(Hidden Markov Models)
. , / . : ·/(match state) () ·/(insert state) () ·/(delete state) () «» «» . . - 1.


ACCY

$$P(\mathrm{ACCY}) = 0.4 \times 0.3 \times 0.46 \times 0.6 \times 0.97 \times 0.5 \times 0.015 \times 0.73 \times 0.01 \times 1 = 1.76 \times 10^{-6}$$
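The product for ACCY can be checked numerically. The sketch below multiplies the ten factors exactly as they appear in the worked example; which factors are emission probabilities and which are transition probabilities is not stated there, so the code treats them simply as an ordered list. The function names are hypothetical. The second function sums logarithms instead, a common way to avoid numerical underflow for long sequences.

```python
import math

# Factors copied, in order, from the worked example for the sequence ACCY.
FACTORS = [0.4, 0.3, 0.46, 0.6, 0.97, 0.5, 0.015, 0.73, 0.01, 1.0]


def path_probability(factors):
    """Probability of one path through the model: the product of all factors."""
    return math.prod(factors)


def path_log_probability(factors):
    """Natural log of the same probability, computed as a sum of logs."""
    return sum(math.log(f) for f in factors)


p = path_probability(FACTORS)
log_p = path_log_probability(FACTORS)
print(f"P(ACCY) = {p:.3g}")
print(f"ln P(ACCY) = {log_p:.3f}")
```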

HMM ( / ) « » HMM .


HMM
· SUPERFAMILY (http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/index.html)
HMM (4894 ), 95%, . .
· PFAM (http://www.sanger.ac.uk/Software/Pfam/) HMM 11912 (Protein FAMilies) . (v.24, October 2009) 75% . : 1. 2. 3. , 4. , ,


Formate dehydrogenase from Pseudomonas sp. 101 (FDH_PSESR, P33160) has two Pfam domains, PF00389 (2-Hacid_dh) and PF02826 (2-Hacid_dh_C): 2-Hacid_dh, residues 47-144; 2-Hacid_dh_C, residues 150-332

Lineage:
Root: scop
Class: Alpha and beta proteins (a/b)
  Mainly parallel beta sheets (beta-alpha-beta units)
Fold: NAD(P)-binding Rossmann-fold domains
  core: 3 layers, a/b/a; parallel beta-sheet of 6 strands, order 321456
  The nucleotide-binding modes of this and the next two folds/superfamilies are similar
Superfamily: NAD(P)-binding Rossmann-fold domains
Family: Formate/glycerate dehydrogenases, NAD-domain
  this domain interrupts the other domain which defines family
Protein: Formate dehydrogenase
Species: Pseudomonas sp., strain 101
PDB Entry Domains:
  2nac (complexed with so4)
    1. region a:148-335
    2. region b:148-335
  2nad (complexed with azi, nad, so4)
    3. region a:148-335
    4. region b:148-335

Lineage:
Root: scop
Class: Alpha and beta proteins (a/b)
  Mainly parallel beta sheets (beta-alpha-beta units)
Fold: Flavodoxin-like
  3 layers, a/b/a; parallel beta-sheet of 5 strand, order 21345
Superfamily: Formate/glycerate dehydrogenase catalytic domain-like
Family: Formate/glycerate dehydrogenases, substrate-binding domain
  this domain is interrupted by the Rossmann-fold domain
Protein: Formate dehydrogenase
  contains an additional beta-hairpin after the common fold
Species: Pseudomonas sp., strain 101
PDB Entry Domains:
  2nac (complexed with so4)
    1. region a:1-147,a:336-374
    1. region b:1-147,b:336-374
  2nad (complexed with azi, nad, so4)
    1. region a:1-147,a:336-391
    2. region b:1-147,b:336-383


. SCOP
http://scop.mrc-lmb.cam.ac.uk/scop/index.html
SCOP: Structural Classification of Proteins. 1.65 release: 20619 PDB Entries (1 August 2003), 54745 Domains (excluding nucleic acids and theoretical models).

Class                            Folds  Superfamilies  Families
All alpha proteins                 179            299       480
All beta proteins                  126            248       462
Alpha and beta proteins (a/b)      121            199       542
Alpha and beta proteins (a+b)      234            349       567
Multi-domain proteins               38             38        53
Membrane proteins                   36             66        73
Small proteins                      66             95       150
Total                              800           1294      2327

· . . . ( ) · : 30% , ( 15%) · ·


( )
1igt

1tcr

2hla

http://www.rcsb.org/pdb/molecules/pdb62_3.html


. CATH
CATH v2.5.1 http://www.biochem.ucl.ac.uk/bsm/cath_new/


De novo ROSETTA, D. Baker et al. Science 310, 638 (2005)
1973 . , / . ( ) . . - , . , . (???).

ROSETTA:
· , (SASA) · (CHARMM/AMBER ) · ·


ROSETTA
· / · ( ),

( ):
. -- . , (almost) ( Pentium 4 3 GHz) , . . Monte-Carlo . . 3 15 CPU days 3.2 GHz.



· · 40% , « » (.. .. ) de novo De novo 150 20%. () 600000 ( ) , 2/3 30% , 30% 16000 , 90% , . D. Baker, A. Sali Science 294, 93-96 (2001).
