CCMB Seminar Series 2009-2010
_______________________________________________________ Events
To receive CCMB seminar
announcements by email, sign up for the computational
biology mailing list by sending email to listserv@listserv.brown.edu with
the message body "subscribe computational-biology"
CCMB Lecture Series
Peter Olofsson
Associate Professor
Trinity University, San Antonio, Texas
Modeling Growth and Telomere Dynamics in Yeast
Telomeres are regions at the ends of chromosomes, serving as protective
buffers against DNA damage. As chromosomes divide, telomeres shorten progressively,
a process counteracted by the enzyme telomerase, which
adds telomeric DNA to chromosomal ends. In the absence of telomerase,
cells eventually stop dividing; in the presence of telomerase, cells divide
indefinitely. Telomere biology is a very active and fruitful field of research
with relevance to problems regarding aging and cancer research. Its importance
was highlighted by the 2009 Nobel Prize in Physiology or Medicine
which was awarded to three telomere biologists.
Telomeres have been extensively studied in the yeast Saccharomyces cerevisiae,
which has given much insight into eukaryotic genetics. One particular
observation is that some yeast cells that lack telomerase,
and would therefore normally stop dividing eventually, keep dividing nevertheless,
indicating that they develop alternative ways of maintaining telomere
length. A general branching process is proposed to model a population of
yeast cells following loss of telomerase. The model takes into account random
variation in individual cell cycle times, telomere length, finite lifespan
of mother cells, and survivorship. We identify and estimate crucial parameters
such as the probability of an individual cell becoming a survivor, and
compare our model predictions to experimental data.
Wednesday, May 19, 2010 (Note the changed date!)
4:00pm
CIT Bldg, Room 241, SWIG Boardroom
Hosted by: Suzanne Sindi
Refreshments will be served at 3:45 pm
CCMB Lecture Series
Cenk Sahinalp
Simon Fraser University
School of Computing Science
Structural Variation Discovery in High Throughput Sequenced Genomes and Transcriptomes
Recent studies show that along with single nucleotide polymorphisms and small indels, larger structural differences contribute significantly to human genetic diversity. The realization of new ultra-high-throughput sequencing platforms has made it feasible to detect the full spectrum of genomic variation among many individual genomes, including variation between healthy tissues and tissues susceptible to disease of genomic origin. Conventional algorithms for identifying structural variation (SV) were not designed to handle the short read lengths and the errors characteristic of current and future high-throughput sequencing technologies. In this talk we will provide combinatorial formulations for SV detection between a reference genome and a high-throughput paired-end sequenced individual genome. We will provide efficient algorithms for each of the formulations we give, all of which turn out to be fast and quite reliable; they are also applicable to all currently available sequencing methods and to traditional capillary sequencing technology.
Tuesday, April 20, 2010
12:00 pm (PLEASE NOTE SPECIAL DAY AND TIME!)
CIT Bldg, Room 241, SWIG Boardroom
Hosted by: Ben Raphael
Refreshments will be served at 11:45 am
CCMB Lecture Series
Juliette de Meaux
Max Planck Institute for Plant Breeding Research
Molecular Underpinning of Life-History Evolution in Arabidopsis thaliana
Abstract TBA
Monday, March 15, 2010 (Note the changed date, time, and place!)
2:00pm
505 BioMed Center
Hosted by: Dan Weinreich
Refreshments will be served at 1:45 pm
CCMB Lecture Series
Eleazar Eskin, Ph.D.
University of California, Los Angeles
Department of Computer Science
Leveraging linkage disequilibrium structure in genome-wide association studies
Variation in human DNA sequences accounts for a significant share of
the genetic risk for common diseases such as hypertension,
diabetes, Alzheimer's disease, and cancer. Identifying the human
sequence variation that makes up the genetic basis of common disease
will have a tremendous impact on medicine in many ways. Recent
efforts to identify these genetic factors through large scale
association studies which compare information on variation between a
set of healthy and diseased individuals have been remarkably
successful. However, despite the success of these initial studies,
many challenges and open questions remain on how to design and analyze
the results of association studies. Many of these challenges
involve taking advantage of the linkage disequilibrium, or correlation,
structure of human variation. In this talk, I will discuss a few of
the computational and statistical challenges in the design and
analysis of association studies.
Wednesday, March 3, 2010
4:00pm
CIT Bldg, Room 241, SWIG Boardroom
Hosted by: Ben Raphael
Refreshments will be served at 3:45 pm
CCMB Lecture Series
Art Covert
Michigan State University
The Hidden Lives of Deleterious Mutations: Transiting fitness valleys via sign-epistatic stepping stones
The role of deleterious mutations in evolution has been much debated. While many researchers believe that any mutation that reduces fitness must impede adaptive evolution, recent studies have shown that this is not always the case. Deleterious mutations may have their fitness effects reversed by a second, sign-epistatic mutation, which can also allow populations to pass through fitness valleys. It is unknown whether these sign-epistatic recoveries are fortuitous accidents or a driving force behind evolution. Using digital organisms, I compared the progress of adaptive evolution when all deleterious mutations were immediately reverted with control treatments in which they were allowed to enter the population. Deleterious mutations reduce fitness over the short term, by definition, and they comprise the majority of mutations in populations of digital organisms, as in biological ones. In my experiments, long-term adaptive evolution was accelerated in those populations in which deleterious mutations were allowed to remain, because some of them served as stepping stones across otherwise impassable fitness valleys, thereby facilitating the evolution of complex features.
Wednesday, January 27, 2010
4:00pm
CIT Bldg, Room 241, SWIG Boardroom
Refreshments will be served at 3:45 pm
Joint CCMB/MPPB/Psychiatry Seminar
Jason Moore
Dartmouth Medical School
Bioinformatics Challenges for Genome-Wide Association Studies
Human genetics is currently dominated by the genome-wide association study (GWAS) that measures and evaluates one million or more single nucleotide polymorphisms (SNPs) for their disease associations. The current biostatistical paradigm is to analyze each SNP individually without regard to the rest of the genome or environmental exposure.
This agnostic or unbiased approach has not been successful for identifying SNPs with moderate or large effects on disease susceptibility. We present here an alternative bioinformatics strategy for GWAS analysis that focuses on gene-gene and gene-environment interactions and their context in biochemical pathways.
Wednesday, November 18, 2009
3:00pm
LMM - 70 Ship Street, Room 107
Refreshments will be served at 2:45 pm
CCMB Lecture Series
Eli Stahl
Brigham and Women's Hospital
The Present and Future of Genome-wide Association Studies in Rheumatoid Arthritis
Results and current progress of a large-scale case-control genome-wide association study (GWAS) of rheumatoid arthritis (RA) shed further light on this autoimmune disease, and help to frame a broad perspective on mapping complex traits. Genotypes at over 2.5 million common single nucleotide polymorphisms (SNPs) were tested for association with RA in 5539 cases and 20169 controls of European descent. Eleven new RA risk alleles replicate in additional samples. Conditional and haplotype analyses refine the association signal in several loci with evidence for multiple independent effects in autoimmunity. Still, all common variant associations validated to date together explain relatively little of the additive genetic variance for RA, and suggest major contributions of (1) many more common variants of very small effect, (2) copy number or other kinds of variants, (3) rare variants, and/or (4) non-additive genetic, epigenetic or non-genetic effects. A polygenic risk score analysis can allow inference of the remaining effect due to common variants en masse (scenario 1, with some implications for scenarios 2 and 3). The direct benefit of current and future common-variant GWAS is limited under all of these scenarios, but GWAS certainly inform complementary approaches including deep re-sequencing in case-control cohorts, and integrated clinical/functional and genetic analyses.
Wednesday, October 28, 2009
4:00pm
CIT Bldg, Room 241, SWIG Boardroom
Hosted by: Daniel Weinreich
Refreshments will be served at 3:45 pm
CCMB Lecture Series
Yosef E. Maruvka
Bar-Ilan University
Genetic polymorphism and demography: a statistical mechanics approach
The recent progress in sequencing techniques has been followed by an exponential growth in the amount of available genetic data. Traditional methods of analysis require exact reconstruction of the phylogenetic tree, and therefore cannot deal with these immense databases. Given the efficiency of "mean field" approximation in physical systems with many particles, we are applying the same techniques and concepts to genetic problems (where it turns out that 50 can be many).
Inference of past demographic parameters from current polymorphism data will be discussed for two examples:
1. Retrieval of the effective population size and its growth rate using the number of lineages as a function of time. Here the mean-field method has been found to be an unbiased estimator, unlike the existing methods, and with a smaller error range.
2. The difference between additive noise and multiplicative noise, a basic concept in statistical mechanics, can be used to help resolve the ongoing debate between the adaptive and the neutral (Hubbell's) theories of biodiversity.
Wednesday, October 21, 2009
4:00pm
CIT Bldg, Room 241, SWIG Boardroom
Hosted by: Daniel Weinreich
Refreshments will be served at 3:45 pm
CCMB Lecture Series
Mark A. DePristo, Ph.D.
Broad Institute of Harvard and MIT
Discovering genetic variation in 1000 Genomes: from mapping reads to putative de novo mutations
The 1000 Genomes Project aims to discover and characterize all common human genetic variation with a minor allele frequency (MAF) ≥ 0.5%. The pilot phase of the project was completed in June, producing five terabases of Illumina/Solexa, SOLiD, and Roche/454 sequences: ~180 individuals sequenced to ~4x average depth genome-wide in three populations, 30-60x whole-genome sequence for two mother-father-daughter trios, and ~800 individuals with 50x+ coverage using hybrid capture in 1000 randomly selected genes.
Here we describe the sequence calibration, realignment, and analysis tools we developed at the Broad to discover, with high sensitivity and specificity, single-nucleotide polymorphisms (SNPs) and short (< 20bp) insertion/deletion polymorphisms (indels) in all three wings of the pilot phase of the 1000 Genomes Project. We assess our approach by comparing discovered variation among technologies, across pilot arms, to population genetic expectations, and to complementary efforts from other groups participating in the 1000 Genomes Project. Finally, we subject a randomly selected subset of SNP and indel calls to experimental validation to estimate project-wide specificity rates. We highlight best practices and lessons learned on the production and analysis of next-generation sequencer data.
Wednesday, October 14, 2009
4:00pm
CIT Bldg, Room 241, SWIG Boardroom
Hosted by: Daniel Weinreich
Refreshments will be served at 3:45 pm
CCMB Lecture Series
Lee A. Newberg, Ph.D.
Wadsworth Institute
Getting statistical significance and Bayesian confidence limits for your hidden Markov model or score-maximizing dynamic programming algorithm, with pairwise alignment of nucleotide sequences as an example
Hidden Markov models and score-maximizing dynamic programming algorithms are employed for the evaluation of sequential data in a variety of scientific fields, including linguistics, vision, and computational biology. Given a hidden Markov model, efficient "Viterbi" and "forward" algorithms are used to evaluate the probability that the model would generate a given sequence of observations, and similar approaches are employed in the dynamic programming algorithms where the focus is on finding high scores instead of high probabilities. Here we present modifications to the "forward" algorithm that allow additional computations. We can efficiently estimate statistical significance: what is the probability that a randomly generated sequence will score at least as high as the observed sequence does? (We've computed answers down to 1e-4000.) We can also compute how typical a sequence is: for every whole number d, what is the probability that a sequence generated by the hidden Markov model will have exactly d differences from the observed sequence?
Wednesday, October 7, 2009
4:00pm
CIT Bldg, Room 241, SWIG Boardroom
Hosted by: Charles (Chip) Lawrence
Refreshments will be served at 3:45 pm
CCMB Lecture Series
Alexandros Stamatakis
Technische Universität München
Department of Computer Science
Mapping the Phylogenetic Likelihood Kernel to Emerging Parallel Computer Architectures
The phylogenetic likelihood function is by far the most compute-intensive part
of every ML-based phylogenetic inference algorithm. I will present several
solutions for appropriately adapting this computational kernel to a variety of
accelerator and supercomputer architectures ranging from FPGAs up to
massively parallel machines like the BG/L. I will also address load-balancing
problems in the kernel and a study of single- versus double-precision
arithmetic trade-offs. Moreover, I will introduce a basic categorization of
input datasets into well-shaped and badly-shaped alignments that require
distinct algorithmic and parallelization approaches.
Finally, I will address an algorithm for rapid phylogenetic placement/identification
of short reads from environmental samples.
Wednesday, June 10, 2009
4:00pm
CIT Bldg, Room 241, SWIG Boardroom
Hosted by: Casey Dunn
Refreshments will be served at 3:45 pm