CCMB Seminar Series 2004-2005
_______________________________________________________ Events
Center for Computational Molecular Biology Seminar Series
Biological and Bioinformatic Approaches to Identifying Regulatory
Elements in mRNA
Scott A. Tenenbaum
Gen*NY*Sis Center for Excellence in Cancer Genomics
University at Albany-SUNY
Abstract:
We have used methods for purifying endogenously formed mRNP complexes and identifying
their associated mRNA targets using microarray technologies (ribonomic profiling).
This has enabled the genomic-scale identification of many mRNA targets of RNA-binding
proteins (RBPs) and has provided new insights into the principles governing post-transcriptional
gene regulation. Using Affymetrix tiling arrays for human chromosomes 21-22, we
have extended our findings and determined the associations of both coding and noncoding
RNAs for several RBPs including HuR, IMP, La and PABP. Tiling arrays are designed
to exhaustively span a designated genetic sequence in a high-density manner to include
all coding and noncoding regions exclusive of highly repeated regions. This allows
for the exhaustive and unbiased analysis of mRNP-associated RNA, annotated and unannotated,
as well as the identification of alternatively spliced products and their association
with RBPs. The tiling array platform used in this study interrogates, on average,
every 35 bases of the approximately 35 million base pairs of chr 21-22 (Kapranov
et al., 2002). Previously, using tiling arrays, it was unexpectedly observed that
a great deal more genomic sequence was transcribed into RNA than can be
accounted for by current annotation. Limited analysis of these novel transcripts
revealed that they possess little protein coding potential and frequently occupy
an antisense orientation relative to well-characterized coding transcripts. By combining
ribonomic profiling with tiling arrays, our studies indicate that in addition to
targeting predicted mRNAs, many of the noncoding RNAs expressed from the genome
also appear to be associated with RBPs in a specific and selective manner. Although
each RBP shows significant uniqueness in its exonic, intronic and novel RNA specificity,
we also observe potentially meaningful overlaps in the RNA subset affinities of
the RBPs that we targeted. The UTRs of many mRNAs contain sequence and structural
motifs that are used to regulate the stability, localization and translatability
of mRNA. Unfortunately, the consensus sequences for these motifs frequently have
significant variability and are only loosely characterized, making simple
alignment tools inadequate for the discovery of new RNA regulatory motifs. Additionally,
many software tools use adaptive techniques that require training. We have generated
a collection of positive-control Training UTR datasets, called “the UAlbany
TUTR collection”, which is meant to be used as blind training/test sets, each of which
contains a previously characterized RNA motif conforming to a defined consensus.
The basic training sets have been generated with associated indexes and "answer
sets" produced to identify where the previously characterized RNA motif (e.g.
the IRE, ARE, SECIS, etc.) resides in each sequence. The UAlbany TUTR collection
is meant to be a shared resource and has been made available to a number of researchers
for software testing. The strengths and weaknesses of different algorithms in
identifying different consensus motifs will be presented. Additionally, we are presently
developing customized tiling-array-based ribonomic profiling technology that enables
the genomic-scale footprinting of RBP binding sites. Examples of this technology
will also be discussed.
Monday, May 9, 2005
4:00 pm
70 Ship Street, Room 107
_______________________________________
Gene Regulation and Probabilistic Graphical Models
Nir Friedman
Hebrew University, Israel
Currently visiting at the Bauer Center for Genomics Research and the Division of
Engineering and Applied Science at Harvard University
Abstract:
High-throughput genome-wide molecular assays have become central to molecular biology.
These assays probe cellular networks from different perspectives and provide rich
and diverse data, posing the challenge of developing methodologies for extracting
meaningful biological insights. The challenge for computational biology is to provide
methodologies for transforming high-throughput heterogeneous datasets into biological
insights about the underlying mechanisms. Integration of data from assays that examine
cellular systems from different viewpoints can lead to a more coherent reconstruction,
reduce the effects of noise, and provide new knowledge about the relevant biological
entities and processes. One class of approaches to answer this challenge builds
on probabilistic graphical models. Such models provide a concise representation
of complex cellular networks by composing simpler sub-models. Procedures
based on well-understood principles for inferring such models from data facilitate
a model-based methodology for analysis and discovery. In this talk I will
discuss a few recent projects that use the language of probabilistic graphical
models to model and understand gene regulation and function from genomics datasets.
Wednesday, April 27, 2005
4:00 pm
McMillan Hall, Room 115