Документ взят из кэша поисковой машины. Адрес оригинального документа : http://kodomo.fbb.msu.ru/hg/allpy/file/afed1fd8920c/test/usecase2.py
Дата изменения: Unknown
Дата индексирования: Mon Feb 4 05:17:52 2013
Кодировка:
allpy: afed1fd8920c test/usecase2.py

allpy

view test/usecase2.py @ 1091:afed1fd8920c

Added backreferences to `Seqeunce`s from `Monomer`s (closes #49) WARNING! Please note that `Sequence` API almost changed entirely! WARNING! This commit immediately obsoletes classmethods `Monomer.from_code*`, `Monomer.from_name` and `Sequence.from_monomers`. Turns out, python can not pickle sets/dicts which have keys, which inderecly reference the set/dict itself: http://bugs.python.org/issue9269 -- which is excatly what we have in abundance after this change. To allow pickling added `__getstate__` to `Monomer` to return all attributes, except `sequence` and `__setstate__` to `Sequence`, which runs through all monomers and returns the `sequence` attribute back to where it belongs. WARNING! This MAY result in unexpected behaviour in some cases. (Which should be rare enough).
author Daniil Alexeyevsky <dendik@kodomo.fbb.msu.ru>
date Sat, 02 Jun 2012 19:33:42 +0400
parents cc1959669928
children 08d892230e8c
line source
1 #!/usr/bin/python
2 import sys
3 from collections import deque
4 from allpy import dna
5 from allpy.processors import Needle, Left
6 from allpy.fileio import FastaFile
7 from allpy.util import open
9 width = 15
10 threshold = 10
11 if __name__ == "__main__":
12 infile = open(sys.argv[1])
13 outfile = open(sys.argv[2])
15 def has_identity(column):
16 as_list = column.values()
17 return len(column) == 2 and as_list[0] == as_list[1]
19 def is_good_window(window):
20 sum_id = sum(int(has_identity(column)) for column in window)
21 return len(window) == width and sum_id >= threshold
23 def find_runs(alignment):
24 window = deque([], width)
25 blocks = []
26 in_block = False
27 for column in alignment.columns:
28 window.append(column)
29 in_block, was_in_block = is_good_window(window), in_block
30 if in_block and not was_in_block:
31 block = dna.Block.from_alignment(alignment, columns=list(window))
32 blocks.append(block)
33 elif in_block:
34 block.columns.append(column)
35 return blocks
37 def blocks_markup(alignment, blocks):
38 for column in alignment.columns:
39 column.in_block = "-"
40 for block in blocks:
41 for column in block.columns:
42 column.in_block = "+"
43 return "".join(column.in_block for column in alignment.columns)
45 def main():
46 alignment = dna.Alignment().append_file(infile)
47 assert len(alignment.sequences) == 2, "Input must have TWO sequences!"
48 alignment.realign(Left())
49 alignment.realign(Needle())
50 blocks = find_runs(alignment)
52 for n, block in enumerate(blocks, 1):
53 block.to_file(open("block_%02d.fasta" % n, "w"))
55 alignment.to_file(outfile)
56 FastaFile(outfile).write_string(
57 blocks_markup(alignment, blocks),
58 "markup",
59 "In run with window %s and threshold %s" % (width, threshold)
60 )
62 try:
63 main()
64 except Exception, e:
65 print "An error has occured:", e