Original document: http://kodomo.fbb.msu.ru/FBB/StudentScience/diplom_2006/Rakhmaninova.doc
Last modified: Mon Dec 12 17:46:09 2005

Dear third- and fourth-year students!

We have received an invitation to collaborate from our colleagues abroad
at the Buck Institute for Age Research (http://www.buckinstitute.org).
They invite FBB students to take part in building a new bioinformatics
tool for experimental biologists and in testing that tool in concrete
studies of protein-protein interactions.

In brief, the bioinformatics (technical) part of the project is to build a
convenient tool for handling a stream of tasks of the following type.
Given: an experiment has yielded a short amino acid pattern.
Task: obtain a sensibly organized list of the functions and/or properties
of proteins whose sequences contain exactly that pattern or a related one.

Note that the project is not limited to writing a program: the quality of
your tool must be confirmed by biologically meaningful results.

For details, see the project abstract.

Project leaders:
supervisor - Prof. A.A. Mironov, FBB
co-supervisor - Dr. Alexei Kurakin, Buck Institute for Age Research,
CA, USA
tutor - A.B. Rakhmaninova

Contact phone: 939-43-31.
e-mail - abr@belozersky.msu.ru

This task can serve as the topic of either a term project or a diploma
thesis. In the latter case, the requirements for the quantity and quality
of your results will be higher.

We invite one or two students who are interested in
- understanding how protein functions are classified;
- understanding the problems of the structure and dynamics of complex
networks;
- mastering modern data clustering methods, both parametric and
non-parametric;
- learning to work in the professional atmosphere of cutting-edge
computational biology.

Requirements for students:

- an acceptable level of programming skill (Java);

- communication: fluent spoken and written English is a must, as
the project is likely to involve a short stay at the Buck Institute, CA.

Project abstract
"Tools for searching amino acid sequences by a short pattern, with
subsequent clustering of the retrieved proteins by their functions"

Project "Pattern-Oriented Retrieval and Clusterization of Sequences"
(or PORCOS for short)

Introduction
The biological organism is a highly organized system of molecular
interactions. Proteins are principal carriers of specificity in this
organization. Complex and dynamic protein interaction networks mediate the
translation of genotype to phenotype. Defining the structure of these
networks and their components is an imperative prerequisite for
understanding the cell and the organism as complex molecular systems.
Because there is no self-sufficient high-throughput methodology for mapping
protein interactions, and because millions of potential specific pair-wise
interactions are realized within the circuitry of the cell, we focus on
proteome-wide analysis of the functionalities integrated around the most
important proteins, or hubs, i.e. the proteins that are highly connected,
meaning that they have a great number of interacting partners.
It has recently been shown that protein interaction networks, in
which proteins are nodes and their specific pair-wise physical
associations are links, are well approximated by scale-free network
models. Such models are characterized by a highly non-homogeneous
distribution of links: a few nodes (hubs) have a large number of
interacting partners, while the majority of nodes have few connections.
This implies a dominant role of hubs in the overall performance and
integration of the network, a suggestion supported by proteome-scale
analysis of mutational phenotypes in the yeast system.
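To make the notion of a hub concrete, here is a minimal Java sketch that counts the distinct interaction partners (the degree) of each protein in a list of pair-wise interactions and reports the proteins whose degree reaches a chosen threshold. The class name `HubFinder`, the protein names, and the threshold are hypothetical and serve only as illustration; the text above does not fix a degree cutoff for calling a node a hub.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class HubFinder {
    /** Counts distinct interaction partners per protein; each pair is {proteinA, proteinB}. */
    static Map<String, Integer> degrees(List<String[]> interactions) {
        Map<String, Set<String>> partners = new TreeMap<>();
        for (String[] pair : interactions) {
            if (pair[0].equals(pair[1])) {
                continue; // assumption: self-interactions do not count as links
            }
            // Sets make duplicate or reversed pairs count only once.
            partners.computeIfAbsent(pair[0], k -> new HashSet<>()).add(pair[1]);
            partners.computeIfAbsent(pair[1], k -> new HashSet<>()).add(pair[0]);
        }
        Map<String, Integer> degree = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : partners.entrySet()) {
            degree.put(entry.getKey(), entry.getValue().size());
        }
        return degree;
    }

    /** Returns proteins with at least minDegree partners, in alphabetical order. */
    static List<String> hubs(List<String[]> interactions, int minDegree) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : degrees(interactions).entrySet()) {
            if (entry.getValue() >= minDegree) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical toy network, not real interaction data.
        List<String[]> toy = Arrays.asList(
                new String[] {"H", "A"}, new String[] {"H", "B"},
                new String[] {"H", "C"}, new String[] {"A", "B"});
        System.out.println(degrees(toy));
        System.out.println(hubs(toy, 3));
    }
}
```

In a scale-free network the degree counts produced this way would be dominated by small values, with a few proteins far above the rest.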
The primary candidates for highly connected nodes are proteins
comprising multiple copies of protein interaction domains (PIDs) such as
SH3, PDZ, WW, BIR, etc. These domains mediate protein interactions by
recognizing and binding short and usually linear peptide epitopes within
their interacting partners. Therefore, the proteome-wide interactions of a
protein consisting of multiple copies of PIDs can be well approximated by
characterization of interactions between the protein's individual PIDs and
the proteomic fragments featuring the corresponding PID binding motifs.
Clustering of interacting partners of the hub under investigation in the
functional space of the proteome is expected to reveal the functional
associations of the hub and provide information about spatio-temporal
organization and integration of different cellular functionalities at the
systems level.

Project

The first part of the proposed project (mainly technical)

Development of a web-based interface for automatic retrieval and
databasing of protein sequences that feature a user-defined pattern, such
as Px(P/A)XXR or PxLPxK. The matching sequences themselves are stored as
well, for further analysis and manipulation if necessary.
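The core of such a search can be sketched in a few lines of Java. The sketch below assumes a pattern syntax inferred from the two examples only: a letter stands for one amino acid, `x` or `X` stands for any residue, and `(P/A)` lists alternatives at one position. The class name `MotifSearch` and the test sequence are hypothetical, and the web interface and database layers are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MotifSearch {
    /** Translates a motif such as "Px(P/A)XXR" into a regular expression. */
    static String toRegex(String motif) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < motif.length()) {
            char c = motif.charAt(i);
            if (c == '(') {
                // "(P/A)" becomes the character class "[PA]".
                int close = motif.indexOf(')', i);
                if (close < 0) {
                    throw new IllegalArgumentException("Unclosed '(' in motif: " + motif);
                }
                String alternatives = motif.substring(i + 1, close).replace("/", "").toUpperCase();
                if (!alternatives.matches("[A-Z]+")) {
                    throw new IllegalArgumentException("Bad alternatives in motif: " + motif);
                }
                regex.append('[').append(alternatives).append(']');
                i = close + 1;
            } else if (c == 'x' || c == 'X') {
                regex.append("[A-Z]"); // assumption: x and X both mean "any residue"
                i++;
            } else if (Character.isLetter(c)) {
                regex.append(Character.toUpperCase(c));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' in motif: " + motif);
            }
        }
        return regex.toString();
    }

    /** Returns the 0-based start positions of all matches, including overlapping ones. */
    static List<Integer> findMatches(String motif, String sequence) {
        Matcher matcher = Pattern.compile(toRegex(motif)).matcher(sequence.toUpperCase());
        List<Integer> starts = new ArrayList<>();
        int from = 0;
        // Restart one position after each hit so that overlapping matches are not skipped.
        while (from <= sequence.length() && matcher.find(from)) {
            starts.add(matcher.start());
            from = matcher.start() + 1;
        }
        return starts;
    }

    public static void main(String[] args) {
        // Hypothetical sequence fragment, used only to exercise the code.
        System.out.println(toRegex("Px(P/A)XXR"));
        System.out.println(findMatches("PxLPxK", "MAPALPGKPSLPTKQ"));
    }
}
```

Searching for "related" patterns, as the task statement asks, would need a looser matching rule than an exact regular expression; that choice is left open here.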

The second part (the research part)

First, map the selected protein set onto the GO tree at different
hierarchical levels, check the statistical significance of the resulting
distributions, and choose the most significant one. This may not work
due to
a. insufficient statistics;
b. inadequacy of functional annotation;
c. both.
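One common way to check the significance of such a distribution is a hypergeometric test for over-representation of a GO term in the selected set. The project description does not name a test, so the choice is an assumption, as are the class name `GoEnrichment` and all counts in the example. The sketch computes the probability of seeing at least the observed number of annotated proteins by chance:

```java
class GoEnrichment {
    /** Natural logarithm of n!, computed by direct summation. */
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    /** Natural logarithm of the binomial coefficient C(n, k), for 0 <= k <= n. */
    static double logChoose(int n, int k) {
        return logFactorial(n) - logFactorial(k) - logFactorial(n - k);
    }

    /**
     * Upper-tail hypergeometric probability P(X >= annotatedInSample).
     *
     * @param populationSize        number of proteins in the background set
     * @param annotatedInPopulation background proteins annotated with the GO term
     * @param sampleSize            number of proteins in the selected set
     * @param annotatedInSample     selected proteins annotated with the GO term
     */
    static double upperTail(int populationSize, int annotatedInPopulation,
                            int sampleSize, int annotatedInSample) {
        if (annotatedInPopulation < 0 || sampleSize < 0 || annotatedInSample < 0
                || annotatedInPopulation > populationSize || sampleSize > populationSize) {
            throw new IllegalArgumentException("Inconsistent counts");
        }
        int notAnnotated = populationSize - annotatedInPopulation;
        // Only values of i for which both binomial coefficients are defined.
        int lowest = Math.max(annotatedInSample, Math.max(0, sampleSize - notAnnotated));
        int highest = Math.min(sampleSize, annotatedInPopulation);
        double logDenominator = logChoose(populationSize, sampleSize);
        double p = 0.0;
        for (int i = lowest; i <= highest; i++) {
            p += Math.exp(logChoose(annotatedInPopulation, i)
                    + logChoose(notAnnotated, sampleSize - i)
                    - logDenominator);
        }
        return Math.min(1.0, p); // guard against rounding slightly above 1
    }

    public static void main(String[] args) {
        // Hypothetical counts: 10 proteins in total, 3 carry the term,
        // 4 were selected, and 2 of the selected carry the term.
        System.out.println(upperTail(10, 3, 4, 2));
    }
}
```

The sketch tests a single GO term. Testing many terms at several levels of the tree would additionally call for a multiple-testing correction, which is not shown.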
The inadequacy of functional annotation may be taken as a fact.
Therefore we will look for ways of meaningful clusterization, for
example, by trying
1) to map the selected set of proteins onto known and documented
protein-protein relationships
such as
a) protein-protein interactions;
b) mRNA co-expression;
c) homology;
d) conserved domain structure;
e) perhaps you can add more here ...
2) to use non-parametric clustering to identify hidden "cliques" in the
selected set of proteins;
3) your proposal...
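
As a starting point for option 1, the selected proteins can be grouped by the documented relationships that connect them. The Java sketch below keeps only relationships in which both partners belong to the selected set and returns the connected components of the resulting graph. This is a deliberately simple stand-in and not the clustering method of the project; the class name `RelationClusters` and the protein names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class RelationClusters {
    /**
     * Groups the selected proteins into connected components of the graph
     * induced by the known relationships; each relation is {proteinA, proteinB}.
     */
    static List<Set<String>> components(Set<String> selected, List<String[]> relations) {
        // Sorted maps and sets keep the output order deterministic.
        Map<String, Set<String>> neighbours = new TreeMap<>();
        for (String protein : selected) {
            neighbours.put(protein, new TreeSet<>());
        }
        for (String[] relation : relations) {
            // Relationships that leave the selected set are ignored.
            if (selected.contains(relation[0]) && selected.contains(relation[1])) {
                neighbours.get(relation[0]).add(relation[1]);
                neighbours.get(relation[1]).add(relation[0]);
            }
        }
        List<Set<String>> clusters = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String start : neighbours.keySet()) {
            if (!seen.add(start)) {
                continue; // already assigned to an earlier cluster
            }
            // Breadth-first traversal collects everything reachable from start.
            Set<String> cluster = new TreeSet<>();
            Deque<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String current = queue.poll();
                cluster.add(current);
                for (String next : neighbours.get(current)) {
                    if (seen.add(next)) {
                        queue.add(next);
                    }
                }
            }
            clusters.add(cluster);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Hypothetical proteins and relationships, not real data.
        Set<String> selected = new HashSet<>(Arrays.asList("A", "B", "C", "D"));
        List<String[]> relations = Arrays.asList(
                new String[] {"A", "B"}, new String[] {"B", "C"}, new String[] {"D", "E"});
        System.out.println(components(selected, relations));
    }
}
```

Connected components tend to merge everything into one large group once the relationship data become dense, so a real analysis would likely need a stricter criterion, such as the clique search mentioned in option 2.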