view test/usecase2.py @ 817:835efa2a8c71
optimization of rasmol_homology: keep structures of only two sequences loaded
One step of this program superimposes all sequences
onto the main sequence and saves all structures to a pdb file.
This does not require the structures of all sequences to be loaded.
At any moment only the structures of the main sequence and of the sequence being superimposed are needed.
This optimization saves a substantial amount of memory.
Output files should be the same as in the previous revision.
To implement this optimization, the superimpose and save_pdb methods
of alignment were replaced with methods of the same names of sequence,
so some code duplicates the code of the alignment methods.
Note: the behavior is the same as before, with the superimpose and save_pdb methods of alignment.
The model was returned by these methods but never used when generating the spt script.
This can result in collisions of rasmol selections when the number of sequences is
greater than the maximum number of chains in one model.
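The idea behind the optimization can be sketched roughly as follows. This is not the allpy code: superimpose_each and the load, superimpose and save callables are hypothetical names used only for illustration. It shows a loop that holds the main structure plus one other structure at a time.

```python
def superimpose_each(main_id, other_ids, load, superimpose, save):
    """Superimpose every structure onto the main one, holding at most two.

    load, superimpose and save are hypothetical callables supplied by the
    caller; they stand in for whatever loads, fits and writes a structure.
    """
    main = load(main_id)  # the main structure stays loaded for the whole run
    save(main)
    for seq_id in other_ids:
        # Rebinding `current` drops the reference to the previous structure,
        # so only `main` and one other structure are referenced at any time.
        current = load(seq_id)
        superimpose(current, main)
        save(current)
```

Compared with loading every structure up front, memory use here no longer grows with the number of sequences, while the order of the saved structures stays the same.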
author: boris (kodomo) <bnagaev@gmail.com>
date: Fri, 15 Jul 2011 02:23:27 +0400
parents: bed32775625a
children: ddf85d0a8924
line source
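import sys  # assumed: needed for sys.stdin and sys.stdout; line not shown in this view
from allpy import dna  # assumed: needed for dna.Alignment and dna.Block; line not shown
from allpy.processors import Needle, Left
from allpy.fileio import FastaFile
from collections import deque

# Assumed values: the definitions of width and threshold are not shown in this view.
width = 15      # window size, in columns
threshold = 8   # minimum number of identical columns in a window

def has_identity(column):
    # A column has identity when both sequences are present and equal.
    as_list = column.values()
    return len(column) == 2 and as_list[0] == as_list[1]

def is_good_window(window):
    # A window is good when it is full and has enough identical columns.
    sum_id = sum(int(has_identity(column)) for column in window)
    return len(window) == width and sum_id >= threshold

def find_runs(alignment):
    window = deque([], width)
    blocks = []       # assumed; line not shown
    in_block = False  # assumed; line not shown
    for column in alignment.columns:
        window.append(column)  # assumed; line not shown
        in_block, was_in_block = is_good_window(window), in_block
        if in_block and not was_in_block:
            # A run starts: the new block takes every column of the window.
            block = dna.Block.from_alignment(alignment, columns=list(window))
            blocks.append(block)  # assumed; line not shown
        elif in_block:  # assumed; line not shown
            block.columns.append(column)
    return blocks  # assumed; line not shown

def blocks_markup(alignment, blocks):
    # The marks "-" and "+" are assumed; the assignments are not shown in this view.
    for column in alignment.columns:
        column.in_block = "-"
    for block in blocks:
        for column in block.columns:
            column.in_block = "+"
    return "".join(column.in_block for column in alignment.columns)

try:  # assumed; line not shown
    alignment = dna.Alignment().append_file(sys.stdin)
    assert len(alignment.sequences) == 2, "Input must have TWO sequences!"
    alignment.realign(Left())
    alignment.realign(Needle())
    blocks = find_runs(alignment)
    for n, block in enumerate(blocks, 1):
        block.to_file(open("block_%02d.fasta" % n, "w"))
    alignment.to_file(sys.stdout)
    FastaFile(sys.stdout).write_string(
        blocks_markup(alignment, blocks),
        "markup",  # assumed name; line not shown
        "In run with window %s and threshold %s" % (width, threshold)
    )
except Exception as e:  # assumed; line not shown
    print "An error has occurred:", e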
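The sliding-window logic of find_runs can be tried without allpy. The sketch below is a simplified stand-in, not the script itself: it works on two already aligned strings, treats "-" as a gap, and returns column ranges instead of blocks. The function name find_identity_runs and its parameters are assumptions made for illustration.

```python
from collections import deque


def find_identity_runs(seq_a, seq_b, width, threshold):
    """Return (start, end) column ranges, end exclusive, covered by good windows.

    A window is good when it is full (width columns) and at least
    threshold of its columns hold identical, non-gap characters.
    """
    window = deque([], width)  # identity flags of the last `width` columns
    runs = []
    in_run = False
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        window.append(a == b and a != "-")
        good = len(window) == width and sum(window) >= threshold
        if good and not in_run:
            # A run starts: it covers every column currently in the window.
            runs.append([i - width + 1, i + 1])
        elif good:
            # The run continues: extend it by the current column.
            runs[-1][1] = i + 1
        in_run = good
    return [tuple(run) for run in runs]
```

With width 3 and threshold 3, the pair "AAAXXXAAA" and "AAAYYYAAA" yields two runs, one over the first three columns and one over the last three.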