Brown University Center for Computational Molecular Biology


CCMB Distinguished Lecture Series 2007-2008

Events

CCMB Distinguished Lecture Series

Jim Collins

Center for BioDynamics
Department of Biomedical Engineering
Boston University

Engineering Gene Networks:
Integrating Synthetic Biology & Systems Biology

Many fundamental cellular processes are governed by genetic programs which employ protein-DNA interactions in regulating function. Owing to recent technological advances, it is now possible to design synthetic gene regulatory networks, and the stage is set for the notion of engineered cellular control at the DNA level.
Theoretically, the biochemistry of the feedback loops associated with protein-DNA interactions often leads to nonlinear equations, and the tools of nonlinear analysis become invaluable. In this talk, we describe how techniques from nonlinear dynamics and molecular biology can be utilized to model, design and construct synthetic gene regulatory networks. We present examples in which we integrate the development of a theoretical model with the construction of an experimental system. We also discuss the implications of synthetic gene networks for biotechnology, biomedicine and biocomputing. In addition, we present integrated computational-experimental approaches that enable construction of first-order quantitative models of gene-protein regulatory networks using only steady-state expression measurements and no prior information on the network structure or function. We discuss how the reverse-engineered network models, coupled to experiments, can be used: (1) to gain insight into the regulatory role of individual genes and proteins in the network, (2) to identify the pathways and gene products targeted by pharmaceutical compounds, and (3) to identify the genetic mediators of different diseases.

Wednesday, November 7th, 2007
CIT Building, Room 227

CCMB Distinguished Lecture Series

James Yorke, Ph.D.

Distinguished University Professor of Mathematics and Physics

Institute for Physical Sciences and Technology (IPST)
University of Maryland

Determining the DNA sequence,
a billion dollar logic puzzle

The genome of an individual is the collection of DNA in each of his/her/its cells. It can be expressed as one or more sequences of the letters A, C, G, T. For mammals the genome has about 3 billion letters, while for a bacterium it has a couple of million. The dominant method used for determining the sequence is called whole genome shotgun assembly. Using this method, the National Institutes of Health has spent about one billion dollars determining the genomes of many species in the past five years. Some parts of a genome turn out to be easier to determine than others, but overall each genome becomes a giant jigsaw puzzle. At the University of Maryland, we try to find techniques for solving as much of the puzzle as possible. The most difficult parts of the puzzle to assemble are often the parts that have been mutating the most over the last few million years. We are also trying to determine the patterns of repeats.
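
As a toy illustration of the jigsaw-puzzle view described above, the sketch below merges short reads by their longest suffix-prefix overlap. It is a simplified greedy heuristic written for this listing, not the assembly software used at Maryland; the function names `overlap` and `greedy_assemble` and the `min_len` threshold are hypothetical. Real assemblers must also cope with sequencing errors, both DNA strands, and repeats, which is where a greedy merge like this one can go wrong.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` letters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Stops when no remaining pair overlaps by at least `min_len` letters
    and returns the resulting contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        # Join the two pieces, writing the shared letters only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping reads sampled from the made-up sequence ACGTTGCATGCCGA.
reads = ["ACGTTGCA", "TGCATGCC", "ATGCCGA"]
contigs = greedy_assemble(reads)
print(contigs)
```

With longer genomes, a repeated stretch can overlap equally well with several reads, so the merge order becomes ambiguous; this is one reason repeats are hard pieces of the puzzle.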

Monday, October 15th, 2007
4:00 pm, CIT Building, Room 241 ~ SWIG Boardroom
Hosted by: Suzanne Sindi

CCMB Distinguished Lecture Series

Nancy Amato, Ph.D.

Parasol Lab, Department of Computer Science
Texas A&M University

Using Motion Planning to Study Molecular Motions

Protein motions, ranging from molecular flexibility to large-scale conformational change, play an essential role in many biochemical processes. For example, some devastating diseases such as Alzheimer's and bovine spongiform encephalopathy (Mad Cow) are associated with the misfolding of proteins. Despite the explosion in our knowledge of structural and functional data, our understanding of protein movement is still very limited because it is difficult to measure experimentally and computationally expensive to simulate.

In this talk we describe a method we have developed for modeling protein motions that is based on probabilistic roadmap methods (PRM) for motion planning. Our technique yields an approximate map of a protein's potential energy landscape and can be used to generate transitional motions of a protein to the native state from unstructured conformations or between specified conformations. We describe a method based on rigidity theory that samples conformation space more efficiently than our initial sampling strategy, enabling us to study a broader range of motions for larger proteins. We also describe new analysis tools that extract kinetics information, such as folding rates. For example, we show that rigidity-based sampling results in maps that capture subtle folding differences between protein G and its mutants, NuG1 and NuG2, and we illustrate how our technique can be used to study large-scale conformational changes in calmodulin, a 148-residue signaling protein known to undergo conformational changes when binding. More information regarding our work, including an archive of protein motions generated with our technique, is available from our protein folding server:

Wednesday, October 10th, 2007
4:00 pm, CIT Building, Room 241 ~ SWIG Boardroom

Hosted by: Franco P. Preparata
Refreshments will be served at 3:45 pm

CCMB Distinguished Lecture Series

Stephen Altschul
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health

"Protein Sequence Database Searches Using Compositionally Adjusted Amino Acid Substitution Matrices"

Abstract: Standard amino acid substitution matrices are constructed as log-odds ratios from large collections of alignments of related proteins.  Any such collection has an implicit "standard" set of amino acid background frequencies. The matrices produced, however, often are used to compare proteins with quite non-standard amino acid compositions. We argue on theoretical grounds that this is inappropriate, and have described a method for transforming a standard matrix into one appropriate for comparing proteins with any non-standard compositions.  Compositionally-adjusted matrices yield improved results from the twin perspectives of alignment score and alignment quality when proteins with strongly biased compositions are compared.

To what extent are such adjusted matrices of utility for general purpose protein database searches?  Using standard test platforms, we compared a standard matrix to compositionally-adjusted matrices, with relative entropy left unconstrained, or constrained in various ways.  We found that constraining the relative entropy of the compositionally adjusted matrix to a fixed value in the new compositional context generally produced the best results. We also found that if the sequences compared are not known to have strong compositional biases, then it is still on average advantageous to use an adjusted matrix when the sequences satisfy certain simple length or compositional inequalities. Applying these findings to general-purpose database searches can lead to a significant improvement in retrieval performance, with a minimal increase in execution time.
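
The log-odds construction mentioned at the start of the abstract can be sketched as follows: each score is the scaled logarithm of the ratio between how often two residues are aligned in related proteins (the target frequency q_ab) and how often they would pair by chance (the product of background frequencies p_a p_b). This is a minimal sketch of the standard construction only, not the compositional adjustment method presented in the talk. The function names, the `scale` parameter, and the two-letter toy alphabet are assumptions made for illustration.

```python
import math


def log_odds_matrix(pair_freqs, scale=2.0):
    """Build log-odds scores from target frequencies.

    pair_freqs maps (a, b) -> q_ab and is assumed symmetric and to sum to 1.
    Background frequencies are the marginals p_a = sum over b of q_ab.
    Scores are scale * log2(q_ab / (p_a * p_b)); scale=2.0 gives half-bit units.
    """
    background = {}
    for (a, _b), q in pair_freqs.items():
        background[a] = background.get(a, 0.0) + q
    scores = {
        (a, b): scale * math.log2(q / (background[a] * background[b]))
        for (a, b), q in pair_freqs.items()
    }
    return scores, background


def relative_entropy(pair_freqs, background):
    """Expected score per aligned pair, in bits, under the target frequencies."""
    return sum(
        q * math.log2(q / (background[a] * background[b]))
        for (a, b), q in pair_freqs.items()
    )


# Toy two-letter alphabet (illustrative numbers, not real amino acid data).
q = {("A", "A"): 0.4, ("A", "B"): 0.1, ("B", "A"): 0.1, ("B", "B"): 0.4}
scores, p = log_odds_matrix(q)
h = relative_entropy(q, p)
print(scores)
print(h)
```

Matches score positive and mismatches negative here because identical letters are aligned more often than chance predicts. The background frequencies `p` are implied by the alignment collection, which is why comparing proteins with a very different composition calls for an adjusted matrix.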

Wednesday, April 9th, 2008
4:00 pm
CIT Building, Room 241 – SWIG Boardroom
Hosted by: Charles E. Lawrence

CCMB Distinguished Lecture Series

Jun Liu

Department of Statistics
Harvard University

"Inference of Patterns and Associations Using Dictionary Models"

Abstract: Pattern discovery is a ubiquitous problem in many disciplines. It has become especially prominent in recent years owing to our greatly improved data-generation capabilities in science and technology. The method I present here is motivated by the "motif-finding" and "module-finding" problems in biology, i.e., to find sequence patterns (i.e., "words") that seem to appear more frequently than usual in a given set of text sequences (i.e., sentences) and to find which of these "words" tend to co-occur in a sentence. A challenge in the motif-finding problem is that there are no spaces or punctuation between the words and the dictionary of "words" is unknown to us. Existing methods are mostly "bottom-up" approaches, i.e., they build up the dictionary starting with single-letter words and then concatenate existing words that appear next to each other in sentences more frequently than expected by chance. Our new approach is a top-down strategy, which uses a tree structure to represent the relationship among all possible existing words and uses the EM algorithm to estimate the usage frequency of each word. It automatically trims down most of the incorrect "words" by letting their usage frequencies converge to zero.

The module-finding problem is closely related to the well-known "market basket" problem, in which one attempts to mine association rules among the items in a supermarket based on customers' transaction records.  It is also related to the two-way clustering problem. In this problem, we assume that the words are given, and our goal is to find subsets of words that tend to co-occur in a sentence.

We call the set of co-occurring words (not necessarily in order) a "theme" or a "module". We can generalize the dictionary model to the "theme" model and use a similar EM strategy to infer these themes. I will demonstrate its applications in a few examples, including an analysis of Chinese medicine prescriptions and an analysis of a Chinese novel.

This is based on joint work with Ke Deng and Zhi Geng.

Wednesday, April 23rd, 2008
4:00 pm
CIT Building, Room 241 – SWIG Boardroom
Hosted by: Charles E. Lawrence
Refreshments will be served at 3:45 pm

CCMB Distinguished Lecture Series

Joe W. Gray

Staff Scientist/Division Director
Lawrence Berkeley National Laboratory
UCSF Comprehensive Cancer Center

"A Systems Approach to Marker Guided Therapy in Breast Cancer"

Joe W. Gray, Ph.D., is Associate Laboratory Director for Life and Environmental Sciences and Life Sciences Division Director at the Lawrence Berkeley National Laboratory (LBNL). He is also Adjunct Professor of Laboratory Medicine at the University of California, San Francisco (UCSF) and program leader for the Breast Oncology Program in the UCSF Comprehensive Cancer Center. Dr. Gray's current research program focuses on molecular analysis technology, identification of genomic aberrations that contribute to cancer pathophysiology, development of efficient strategies for enhanced marker-guided cancer therapy (especially for breast and ovarian cancer), and early breast cancer detection. His work is described in more than 330 publications and 50 patents. Major awards include the Radiation Research Society Research Award (1985), the E.O. Lawrence Award from the US Department of Energy (1986), election as a Fellow of the American Association for the Advancement of Science (1996), the Curt Stern Award from the American Society of Human Genetics (2001), an Honorary Doctorate from the University of Tampere, Tampere, Finland (2005), a DOD Innovator Award (2007), and the Brinker Award (2007). Dr. Gray earned a Ph.D. in physics from Kansas State University.

Wednesday, April 30th, 2008
Sidney Frank Hall, Room 220
Hosted by: Ben Raphael and John Sedivy

CCMB Distinguished Lecture Series

Gad Kimmel
University of California, Berkeley

Faculty Candidate
Center for Computational Molecular Biology

"Computational Problems in Human Genetics"

Abstract: The question of how genetic variation and personal health are linked is one of the compelling puzzles facing scientists today. The ultimate goal is to exploit human variability to find genetic causes for multi-factorial diseases such as cancer and coronary heart disease. Recent technological improvements enable the typing of millions of single nucleotide polymorphisms (SNPs) for a large number of individuals. Consequently, there is a great need for efficient and accurate computational tools for rigorous and powerful analysis of these data. In my talk I will concentrate on two computational problems, each an essential step in studying the data obtained by this technology: accurate and efficient significance testing with a correction for population stratification, and estimating local ancestries in admixed populations.

Wednesday, May 14th, 2008
4:00 pm
182 George Street, Applied Math Building – Room #110
Hosted by: Charles E. Lawrence

CCMB Distinguished Lecture Series

Fumei Lam
Brown University

"Imperfect Ancestral Recombination Graph Reconstruction Problem: A Hierarchy of Upper Bounds"

Abstract: Reconstruction of evolutionary histories is a fundamental problem in computational biology. It has been established that accurately representing complete evolutionary histories requires an underlying model that incorporates non-tree operations, corresponding to the mixing of genetic material from ancestral sequences. In this talk, we will address the problem of finding parsimonious evolutionary histories with both hybridization and mutation events (the Imperfect Ancestral Recombination Graph Reconstruction Problem). The power of our framework is the connection between our formulation and the Directed Steiner Arborescence Problem in combinatorial optimization. We implement linear programming techniques as well as heuristics for the Directed Steiner Arborescence Problem and apply these algorithms to simulated and benchmark data sets.

This is joint work with Ryan Tarpine and Sorin Istrail.

Wednesday, May 21, 2008
4:00 pm

CIT Building, Room 241, SWIG Boardroom
Hosted by: Sorin Istrail

Refreshments will be served at 3:45 p.m.

