Document taken from a search engine cache. Address of the original document: http://kodomo.fbb.msu.ru/hg/allpy/file/df624c729ab5/utils/freqs.py
allpy: df624c729ab5 utils/freqs.py

view utils/freqs.py @ 690:df624c729ab5

Implemented SequenceCaseMarkup along with the required changes to Monomer.
author Daniil Alexeyevsky <dendik@kodomo.fbb.msu.ru>
date Tue, 05 Jul 2011 18:30:42 +0400
parents 7184863832b9
children aa3cf3b44f86
line source
"""Read alignment on stdin. Print CSV table of letter frequencies on stdout.
"""
from allpy import protein
from allpy.data import codes
import sys

sys.stderr.write(__doc__)

def freq(monomer):
    # Percentage of sequences with `monomer` in the current column;
    # reads the module-level `freqs` and `width` set below.
    amount = freqs.get(monomer)
    if amount:
        return 100.0 * amount / width
    return ""  # leave the CSV cell empty when the letter is absent

aln = protein.Alignment().append_file(sys.stdin)
# One-letter codes of unmodified residues, plus "-" for gaps.
monomers = [code1 for code1, modified, _, _ in codes.protein if not modified]
monomers += ["-"]
width = len(aln.sequences)
# Header row. A parenthesized single-argument print works in Python 2 and 3.
print(", ".join(map(str, monomers)))
for column in aln.columns_as_lists():
    # Count occurrences of each letter in this column.
    freqs = {}
    for monomer in column:
        if monomer:
            monomer = monomer.code1
        else:
            monomer = "-"
        freqs[monomer] = freqs.get(monomer, 0) + 1
    print(", ".join(map(str, map(freq, monomers))))
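
As a rough illustration of what the script computes, the sketch below reproduces the per-column percentage calculation on plain strings, without allpy. It is not part of the repository: `column_frequencies`, `rows`, and `alphabet` are hypothetical names, the alignment is assumed to be a list of equal-length strings, and "-" is assumed to mark gaps. It also swaps the hand-written counting dictionary for `collections.Counter`.

```python
from collections import Counter

GAP = "-"  # assumed gap character, matching the "-" used by the script


def column_frequencies(rows, alphabet):
    """Yield one list per column: the percentage of each letter of
    `alphabet` in that column, or "" when the letter does not occur."""
    height = len(rows)  # number of sequences, the script's `width`
    for column in zip(*rows):  # transpose rows into columns
        counts = Counter(column)
        yield [
            100.0 * counts[letter] / height if counts[letter] else ""
            for letter in alphabet
        ]


# Hypothetical toy alignment: four sequences, three columns.
rows = ["AC-", "AG-", "TCA", "ACA"]
alphabet = ["A", "C", "G", "T", GAP]

table = list(column_frequencies(rows, alphabet))
print(", ".join(alphabet))
for row in table:
    print(", ".join(map(str, row)))
```

Passing the alphabet explicitly keeps the CSV columns in a fixed order, which is the same reason the script builds its `monomers` list before iterating over columns.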