Wednesday, April 11, 2018 4:00pm - 5:00pm
Watson CIT - SWIG Boardroom (CIT241)
Lane Fellow, Computational Biology Department
School of Computer Science, Carnegie Mellon University
"Efficient Algorithms for Large-Scale Transcriptomics and Genomics"
I will present modeling and algorithmic designs for two challenging problems in biology and show that efficient computational methods enable significant advances in our understanding of cell machinery and genome evolution. The first problem is the assembly of full-length transcripts -- the collection of expressed gene products in cells -- from noisy and highly fragmented data obtained through RNA sequencing. I first formulate this as a graph decomposition problem and then design an efficient algorithm for it that is guaranteed to preserve all long-range information. Integrating and assembling 7,000 RNA-seq samples with this algorithm yields a more complete human transcriptome and reveals many potentially novel transcripts. The second problem is the reconstruction of a large phylogeny -- the evolutionary history of a large collection of extant species -- based on genome structures obtained from whole-genome sequencing. A basic computational problem here is to define and compute an evolutionary distance between whole genomes. I will describe our efficient exact and approximation algorithms for comparing genomes under various evolutionary models. These algorithms can uncover the evolutionary relationships between genes across many genomes, even very large mammalian genomes.
Hosted by the CCMB and CS