Seminar Archive

  • Mar
    13
    Virtual and In Person
    12:00pm - 1:00pm

    Statistics Seminar and Charles K. Colver Lectureship Series | Mahlet G. Tadesse, Ph.D.

    School of Public Health at Brown University, 121 South Main Street, Providence, RI 02912, Rm 245

    Mahlet G. Tadesse, Ph.D.,
    Professor and Chair
    Department of Mathematics & Statistics
    Georgetown University

     

    Talk Title: Variable selection in mixture models: Uncovering cluster structures and relevant features

    Abstract: Identifying latent classes and component-specific relevant predictors can yield important insights when analyzing high-dimensional data. In this talk, I will present methods we have proposed to address this problem in a unified manner by combining ideas from mixture models and variable selection. In particular, I will discuss (1) an integrative model to relate two high-dimensional datasets by fitting multivariate mixture of regression models using stochastic partitioning, and (2) a mixture of regression trees approach to uncover homogeneous subgroups of observations and their associated predictors, accounting for non-linear relationships and interaction effects. I will illustrate the methods with genomic applications.

    *Light refreshments will be served

    More Information Biology, Medicine, Public Health, BioStatsSeminar, Research
  • Mar
    6
    Virtual and In Person
    12:00pm - 1:00pm

    Statistics Seminar and Charles K. Colver Lectureship Series | Sudipto Banerjee, Ph.D.

    School of Public Health at Brown University, 121 South Main Street, Providence, RI 02912, Rm 245

    Sudipto Banerjee, Ph.D.,
    Professor and Chair
    Department of Biostatistics
    University of California, Los Angeles

     

    Talk Title: Bayesian Hierarchical Modeling and Inference for High-Resolution Actigraph Data Using Wearable Devices

    Abstract: Rapid developments in streaming data technologies have enabled real-time monitoring of human activity. Wearable devices, such as wrist-worn sensors that monitor gross motor activity (actigraphy), have become prevalent. An actigraph unit continually records the activity level of an individual, producing large amounts of high-resolution measurements that can be immediately downloaded and analyzed. While this type of BIG DATA includes both spatial and temporal information, we argue that the underlying process is more appropriately modeled as a stochastic evolution through time, while accounting for spatial information separately. A key challenge is the construction of valid stochastic processes over paths. We devise a spatial-temporal modeling framework for massive amounts of actigraphy data, while delivering fully model-based inference and uncertainty quantification. Building upon recent developments in scalable inference, we construct temporal processes using directed acyclic graphs (DAGs) and develop optimized implementations of collapsed Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference. We test and validate our methods on simulated data and subsequently apply and verify their predictive ability on an original dataset from the Physical Activity through Sustainable Transport Approaches (PASTA-LA) study conducted by UCLA’s Fielding School of Public Health.

    *Light refreshments will be served

  • Dec
    5
    Virtual and In Person
    12:00pm - 1:00pm

    Statistics Seminar | Eric J. Daza, Dr.P.H.

    School of Public Health, 121 South Main Street, Rm 245

    Eric J. Daza, Dr.P.H.,
    Lead Biostatistician and Health Data Scientist
    Evidation

    Talk Title: Using Wearables and Apps to Characterize Your Own Recurring Average Treatment Effects

    Abstract: Temporally dense single-person “small data” have become widely available thanks to mobile apps (e.g., those that provide patient-reported outcomes) and wearable sensors. Many caregivers and self-trackers want to use these intensive longitudinal data to help a specific person change their behavior to achieve desired health outcomes. Ideally, this involves discerning possible causes from correlations using that person’s own observational time series data. In paper one, we estimate within-individual average treatment effects of sleep duration on physical activity, and vice versa. We introduce the model-twin randomization (MoTR; “motor”) and propensity score twin (PSTn; “piston”) methods for analyzing Fitbit sensor data. MoTR is a Monte Carlo implementation of the g-formula (i.e., standardization, back-door adjustment); PSTn implements propensity score inverse probability weighting. They estimate idiographic stable recurring effects, as done in n-of-1 trials and single case experimental designs. We characterize and apply both methods to the two authors’ own data, and compare our approaches to standard methods (with possible confounding) to show how to use causal inference to make truly personalized recommendations for health behavior change. In paper two, we apply MoTR to the three authors, thereby providing a guide for using MoTR to investigate your own recurring health conditions, and demonstrating how any suggested effects can differ greatly from those of other individuals.

    *Light refreshments will be served

  • Talk Title: Delphi’s COVIDcast Project: Lessons from Building a Digital Ecosystem for Tracking and Forecasting the Pandemic

    Abstract: In March 2020, the Delphi group at Carnegie Mellon University launched an effort called COVIDcast, which has many parts: 1. unique relationships with partners in tech/healthcare, granting us access to data on pandemic activity; 2. infrastructure to build real-time, geographically-detailed COVID-19 indicators from this data; 3. a historical database of all indicators, including revision tracking; 4. a public API, serving new indicators daily; 5. interactive graphics to display our indicators; 6. nowcasting and forecasting models building on the indicators. This talk gives a high-level summary, with discussion of some lessons learned.

  • Nov
    21
    Virtual and In Person
    12:00pm - 1:00pm

    Statistics Seminar | Alberto Abadie, Ph.D.

    School of Public Health, 121 South Main Street, Rm 245

    Alberto Abadie, Ph.D.,
    Professor of Economics
    Massachusetts Institute of Technology

    Talk Title: Synthetic Controls for Experimental Design

    Abstract: This article studies experimental design in settings where the experimental units are large aggregate entities (e.g., markets), and only one or a small number of units can be exposed to the treatment. In such settings, randomization of the treatment may induce large ex-post estimation biases under many or all possible treatment assignments. We propose a variety of synthetic control designs (Abadie and Gardeazabal, 2003; Abadie, Diamond, and Hainmueller, 2020) as experimental designs to select treated units in non-randomized experiments with large aggregate units, as well as the untreated units to be used as a control group. Average potential outcomes are estimated as weighted averages of treated units for potential outcomes with treatment, and control units for potential outcomes without treatment. We analyze the properties of such estimators and propose new inferential techniques. We conduct extensive simulations to study the performance of different synthetic control designs.

    *Light refreshments will be served

  • Nov
    7
    Virtual and In Person
    12:00pm - 1:00pm

    Statistics Seminar | Benjamin French, Ph.D.

    School of Public Health, 121 South Main Street, Rm 245

    Talk Title: Acceleration of residual lifetime among survivors of the atomic bombings of Japan

    Abstract: The Life Span Study of atomic bomb survivors includes residents of Hiroshima and Nagasaki, Japan, who were located within 10 km of the hypocenter at the time of the bombings in 1945, and a matched sample of Hiroshima and Nagasaki residents who were not in either city at the time of the bombings. Radiation risk estimates from the Life Span Study are used to inform policies for radiological protection in occupational, medical, and public health settings, as well as for survivors’ welfare. The association between radiation dose and mortality is known to depend on the age at which the survivor was exposed; however, capturing this dependence using standard survival analysis methods such as the Cox model can be challenging. We therefore extend the accelerated failure time model to quantify the association between radiation dose and acceleration of residual survival at the age at which the survivor was exposed. Parametric estimation methods provide point estimates and confidence intervals for the relative mean (or median) residual survival as a function of radiation dose for a given age at exposure compared to an unexposed group that survived to the same age. Analysis of the Life Span Study data reveals a complex relationship between radiation dose and residual survival, such that radiation risk appears more pronounced among younger survivors.

    *Light refreshments will be served

  • Oct
    31
    Virtual and In Person
    12:00pm - 1:00pm

    Statistics Seminar | Hao Wu, Ph.D.

    School of Public Health, 121 South Main Street, Rm 245

    Talk Title: Supervised cell type identification for single cell ATAC-seq data

    Abstract: Computational cell type identification (celltyping) is a fundamental step in single-cell omics data analysis. Supervised celltyping methods have gained increasing popularity in single-cell RNA-seq data analysis because of their superior performance and the availability of high-quality reference datasets. Recent technological advances in profiling chromatin accessibility at single-cell resolution (scATAC-seq) have brought new insights to the understanding of epigenetic heterogeneity. With the continuous accumulation of scATAC-seq datasets, a supervised celltyping method specifically designed for scATAC-seq is urgently needed. In this talk, I will mainly present our recent method developments on supervised celltyping for scATAC-seq using scATAC-seq data as reference. We developed Cellcano, a novel computational method based on a two-round supervised learning algorithm. The method alleviates the distributional shift between reference and target data and significantly improves the prediction performance. If time allows, I will also present our ongoing work on using scRNA-seq data as reference.

    *Light refreshments will be served

  • Oct
    24
    Virtual
    12:00pm - 1:00pm

    Statistics Seminar | Richard Cook, Ph.D.

    School of Public Health, 121 South Main Street, Rm 245

    Talk Title: Defining and addressing dependent observation schemes in life history studies

    Abstract: Multistate models provide a powerful framework for the analysis of life history processes when the goal is to characterize transition intensities, transition probabilities, state occupancy probabilities, and covariate effects thereon. Data on such processes are often only available at random visit times occurring over a finite period. We formulate a joint multistate model for the life history process, the recurrent visit process, and a random loss to follow-up time at which the visit process terminates. This joint model is helpful when discussing the independence conditions necessary to justify the use of standard likelihoods involving the life history model alone, and provides a basis for analyses that accommodate dependence. We consider settings where disease-driven visits occur in combination with routinely scheduled visits, and develop likelihoods that accommodate partial information on the types of visits. Simulation studies verify that it is possible to fit joint models to mitigate biases that otherwise arise from dependent visit processes. Biases arising from intermittent observation of time-dependent biomarkers, and selection biases from dependent left-truncation, will also be considered if time permits, along with strategies for fitting suitable multistate models to mitigate biases. This talk is based on joint work with Jerry Lawless.

  • Oct
    17
    Virtual and In Person
    12:00pm - 1:00pm

    Statistics Seminar | Ronghui (Lily) Xu, Ph.D.

    School of Public Health, 121 South Main Street, Rm 245

    Talk Title: Doubly Robust Estimation under the Marginal Structural Cox Model

    Abstract: The marginal structural Cox model (Cox MSM) has been widely used to draw causal inference from observational studies with survival outcomes. The typical estimation approach under the Cox MSM is inverse probability of treatment weighting (IPTW) using a propensity score (PS) model, which is known to be inconsistent if the propensity score model is misspecified. Efforts to protect against such model misspecification involve augmentation, which has been a challenge in the past due to the non-collapsibility of the Cox regression model. In this work we develop an augmented inverse probability weighted (AIPW) estimator with doubly robust properties, including rate double robustness, which enables us to use machine learning and a large class of nonparametric methods to overcome the non-collapsibility challenge. We study both the theoretical and empirical performance of the augmented inverse probability weighted estimator. Time permitting, we will also discuss informative censoring under the Cox MSM, where inverse probability of censoring weighting (IPCW) is needed and its augmentation leads to an AIPW estimator that protects against misspecification of both the PS and the censoring models.

    *Light refreshments will be served

  • Oct
    3
    Virtual and In Person
    12:00pm - 1:00pm

    Statistics Seminar | Peter Mueller, Ph.D.

    School of Public Health, 121 South Main Street, Rm 245

    Talk Title: Bayesian Nonparametric Common Atoms Regression for Generating Synthetic Controls in Clinical Trials

    Abstract: We develop a Bayesian nonparametric approach for creating synthetic controls from real-world data (RWD) to supplement treatment-only single-arm trials. We introduce a Bayesian common atoms regression model that clusters covariates with similar values across different treatment arms. Exploiting the common atoms structure, we propose a density-free importance sampling scheme to sample a subpopulation of the RWD such that the covariates in the subpopulation have the same distribution as the actual patients, allowing for a valid treatment comparison. Inference under the proposed common atoms mixture model can be characterized as a stochastic stratification by propensity score (for selection into control or treatment arm). The proposed design is implemented for glioblastoma trials.

    *Light refreshments will be served
