Nhunt: solving the low complexity problem for nucleic acid homology search

Yury Pekov1, Kirill Romanenkov2, Alexey Salnikov2, Sergei Spirin3

1Faculty of Bioengineering and Bioinformatics, 2Faculty of Computational Mathematics and Cybernetics, and 3Belozersky Institute of Physico-Chemical Biology, Moscow State University. E-mail: sas@belozersky.msu.ru

We introduce Nhunt, a program for sensitive nucleic acid homology search. It is designed to find distant homologues of non-coding sequences, such as non-coding RNAs, genomic repeats, and introns. The well-known "low complexity problem" arises when the nucleotide frequencies in the two sequences being compared differ significantly from one-fourth. If two sequences with the same compositional bias are aligned using the usual scoring scheme, the resulting alignment is significantly overscored. For such regions it is reasonable to use an adjusted substitution scoring matrix. Any substitution matrix is valid in at most one background frequency context. Our goal is to transform the initial substitution matrix s into a target matrix S that is valid in a given (nonstandard) context. The same problem exists for protein alignment and protein homology search. BLASTP uses the matrix adjustment suggested by Yu, Wootton and Altschul [1], which helps avoid the problem in protein homology search. For nucleic acid homology search, no tools of this kind are currently available. We have developed a number of algorithms for nucleic acid matrix adjustment and present results of testing them on several biological examples. In most cases the adjustment significantly improves search quality. The algorithms are implemented in the program Nhunt, which is written in C. Executable files for Linux and the source code are available at http://mouse.belozersky.msu.ru/~bennigsen/nhunt.html. A parallel version of Nhunt and a web interface to it are included in the system "Processing biosequences on multiprocessors" (http://angel.cs.msu.su/temp-aligner/), which makes it possible to run Nhunt on the supercomputers of Moscow State University.

This work is supported by a joint grant of the Russian Foundation for Basic Research (grant no. 12-0491334) and the German Research Foundation (grant IRTG 1563/1).

1. Yu, Wootton, Altschul. The compositional adjustment of amino acid substitution matrices. PNAS 2003; 100: 15688-15693.
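
The overscoring effect described in the abstract can be illustrated numerically. The sketch below is not the Nhunt algorithm, which the abstract does not specify. It assumes a simple +1/-3 match/mismatch scheme and a hypothetical AT-rich composition, neither of which comes from the abstract, and computes the Karlin-Altschul scale parameter lambda, i.e. the positive root of sum_ij p_i q_j exp(lambda * s_ij) = 1. A smaller lambda under a skewed composition means that the same raw score carries less statistical evidence, so scoring with the uniform-composition matrix overstates significance.

```python
import math

# Assumed illustrative scoring scheme (not taken from the abstract):
# +1 for a match, -3 for a mismatch, over the alphabet A, C, G, T.
MATRIX = [[1 if i == j else -3 for j in range(4)] for i in range(4)]

UNIFORM = [0.25, 0.25, 0.25, 0.25]   # standard background frequencies
AT_RICH = [0.45, 0.05, 0.05, 0.45]   # hypothetical skewed composition


def expected_score(matrix, p, q):
    """Expected score of one aligned pair of unrelated letters."""
    return sum(p[i] * q[j] * matrix[i][j]
               for i in range(4) for j in range(4))


def karlin_lambda(matrix, p, q, tol=1e-12):
    """Positive root of sum_ij p_i q_j exp(lambda * s_ij) = 1, by bisection.

    Requires a negative expected score and at least one positive score,
    which guarantees that exactly one positive root exists.
    """
    if expected_score(matrix, p, q) >= 0:
        raise ValueError("expected score must be negative")

    def f(lam):
        return sum(p[i] * q[j] * math.exp(lam * matrix[i][j])
                   for i in range(4) for j in range(4)) - 1.0

    lo, hi = 1e-6, 1.0
    while f(hi) < 0:          # grow the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    for name, freqs in (("uniform", UNIFORM), ("AT-rich", AT_RICH)):
        print(name,
              "expected score:", round(expected_score(MATRIX, freqs, freqs), 3),
              "lambda:", round(karlin_lambda(MATRIX, freqs, freqs), 3))
```

Under these assumptions the expected score per aligned pair is higher and lambda is lower for the AT-rich composition than for the uniform one. This is the sense in which a matrix tuned to one background is not valid in another.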