Doctoral Program

The primary mission of the doctoral program in Biostatistics is to provide the training necessary to carry out independent research in theory, methodology and application of statistics to important problems in biomedical research, including research biology, public health and clinical medicine. All students in the doctoral program in Biostatistics are required to demonstrate mastery of advanced biostatistical methods, which is assessed via coursework and examinations.

 

The PhD Program in Biostatistics trains students to:

  • Develop new quantitative methods and underlying theory
  • Make innovative applications to substantive and demanding scientific problems
  • Lead and participate in interdisciplinary research involving public health, medicine, biology, and the social sciences

Twenty-four credits are required of students matriculating in the program without a master's degree; 16 are required beyond the master's.  For those with a related master's degree, up to eight units can be transferred.  Students are expected to participate in academic activities such as the Statistics Seminar and faculty-organized working groups.

Within the Department, the major requirements for the PhD are:

  1. completion of a program of courses covering core areas of required expertise (see details below)
  2. demonstration of proficiency in teaching
  3. synthesis of a core body of knowledge, evaluated via written examination
  4. demonstration of readiness to undertake original research, via oral presentation and defense of a written dissertation proposal (oral exam)
  5. completion and oral defense of a dissertation that makes an original contribution in the chosen field of study.

The methods for meeting these requirements may differ depending on the individual program of study.

Competencies in biostatistics are divided into three Core areas:

  1. Theory and Methods of Inference (Core A)
  2. Methods of Biostatistical Analysis (Core B)
  3. Advanced Training (Core C)

Biostatistics Core A: Theory and Methods of Inference

Statistical Inference I (PHP2520)
Linear Models (PHP2601)
Statistical Inference II (PHP2580)
Bayesian Inference (PHP2530)
Generalized Linear Models (PHP2605)
Practical Data Analysis (PHP2550)

Biostatistics Core B: Methods of Biostatistical Analysis

Analysis of Lifetime Data (PHP2602)
Causal Inference and Missing Data (PHP2610)

Biostatistics Core C: Advanced Training Electives in Statistical Methodology

Clinical Trials (PHP2030)
Statistical Methods for Bioinformatics (PHP2620)
Analysis of Longitudinal Data (PHP2603)
Statistical Methods for Spatial Data (PHP2604)
Advanced Topics in Biostatistics (PHP2690)
Qualifying courses in other departments (APMA, ECON, CS), with approval from Graduate Director

Other Requirements

Introduction to Methods in Epidemiologic Research (PHP2120)
Course in substantive field of application
Journal Club
Online Public Health Overview Instruction (PHP 101)
School of Public Health Responsible Conduct in Research Training - 1st Semester

The Department of Biostatistics offers the following courses:

(See "Program Requirements" drop down for eligible courses for your degree/track)

Updated Fall 2018

Course # Course Name
PHP 0100 Statistics Everywhere (Undergraduate Course)
  Freshman Seminar: Statistics is the universal language behind data-enabled decision making. Examples include Google's page ranking, Amazon's customer recommendations, weather prediction, medical care and political campaign strategy. This seminar will expose students to a variety of problems encountered in the media, in science and in life for which solutions require analysis of and drawing inferences from data. We will introduce basic concepts such as randomness, probability, variation, statistical significance, accuracy, bias and precision. The course will discuss statistical problems from reading assignments and material identified by the students. We will use simulation to illustrate basic concepts, though previous programming experience is not required.
PHP 1501 Essentials of Data Analysis (Undergraduate Course)
  This course covers the basic concepts of statistics and the statistical methods commonly used in the social sciences and public health with an emphasis on applications to real data. The first half of the course introduces descriptive statistics and the inferential statistical methods of confidence intervals and significance tests. The second half introduces bivariate and multivariate methods, emphasizing contingency table analysis, regression, and analysis of variance. This is designed to be a first course in Statistics. The course is intended for Public Health or Statistics concentrators. Others can register with instructor's permission. There are no prerequisites.
PHP 2030 Clinical Trial Methodology
  We will examine the modern clinical trial as a methodology for evaluating interventions related to treatment, rehabilitation, prevention and diagnosis. Topics include the history and rationale for clinical trials, ethical issues, study design, protocol development, sample size considerations, quality assurance, statistical analysis, systematic reviews and meta-analysis, and reporting of results. Extensively illustrated with examples from various fields of health care research.
PHP 2507 Biostatistics and Applied Data Analysis I
  The objective of the year-long, two-course sequence is for students to develop the knowledge, skills and perspectives necessary to analyze data in order to answer public health questions. The year-long sequence will focus on statistical principles as well as the applied skills necessary to answer public health questions using data, including: data acquisition, data analysis, data interpretation and the presentation of results. Through lectures, labs and small group discussions, this fall semester course will focus on identifying public health data sets, refining research questions, univariate and bivariate analyses and presentation of initial results. Prerequisite: understanding of basic math concepts and terms; basic functional knowledge of Stata. Enrollment limited to 50 MPH and CTR students. Instructor permission required.
PHP 2508 Biostatistics and Applied Data Analysis II
  Biostatistics and Applied Data Analysis II is the second course in a year-long, two-course sequence designed to develop the skills and knowledge to use data to address public health questions. The courses are specifically for students in the Brown MPH program, and the training programs in Clinical and Translational Research. The sequence is completed in one academic year, not split across two years. The courses focus on statistical principles as well as the applied skills necessary to answer public health questions using data, including: acquisition, analysis, interpretation and presentation of results. Prerequisite: PHP 2507. Enrollment limited to 48. Instructor permission required.
PHP 2510 Principles of Biostats & Data Analysis
  Intensive first course in biostatistical methodology, focusing on problems arising in public health, life sciences, and biomedical disciplines. Summarizing and representing data; basic probability; fundamentals of inference; hypothesis testing; likelihood methods. Inference for means and proportions; linear regression and analysis of variance; basics of experimental design; nonparametrics; logistic regression.
PHP 2511 Applied Regression Analysis
  Applied multivariate statistics, presenting a unified treatment of modern regression models for discrete and continuous data. Topics include multiple linear and nonlinear regression for continuous response data, analysis of variance and covariance, logistic regression, Poisson regression, and Cox regression.
PHP 2514 Applied Generalized Linear Models
  This course provides a survey of generalized linear models (GLMs) for outcomes including continuous, binary, count, survival and correlated data. This course will work through the basic theories of GLMs. Emphasis will be on understanding the implications of this theory and the applications to solving real data problems. Extensive use of computer programming will be required to analyze the data in this class. This course is designed for graduate and advanced undergraduate students who will be analyzing data and want to develop a practical hands-on toolkit as well as an understanding of the theoretical underpinnings of regression.
PHP 2515 Fundamentals of Probability & Statistical Inference
  This course will provide an introduction to probability theory, mathematical statistics and their application to biostatistics. The emphasis of the course will be on basic mathematical and probabilistic concepts that form the basis for statistical inference. The course will cover fundamental ideas of probability, some simple statistical models (normal, binomial, exponential and Poisson), sample and population moments, finite and approximate sampling distributions, point and interval estimation, and hypothesis testing. Examples of their use in modeling will also be discussed.
PHP 2516 Applied Longitudinal Data Analysis
  This course provides a survey of longitudinal data analysis. Topics include exploratory analysis, study design considerations, covariance structures, generalized linear models for longitudinal data, marginal models and mixed effects. Data and examples will come from medical/pharmaceutical applications, public health and social sciences.
PHP 2517 Applied Multilevel Data Analysis
  This course provides a survey of multilevel data analysis. Topics include the structure of multilevel data, basic multilevel linear models, multilevel GLMs, model testing and evaluation, and missing data imputation. Data and examples will be drawn from medical, public health and social sciences. Students will be using real data throughout this course.
PHP 2520 Statistical Inference I
  First of two courses that provide a comprehensive introduction to the theory of modern statistical inference. PHP 2520 presents a survey of fundamental ideas and methods, including sufficiency, likelihood based inference, hypothesis testing, asymptotic theory, and Bayesian inference. Measure theory not required.
PHP 2530 Bayesian Inference
  Surveys the state of the art in Bayesian methods and their applications. Discussion of the fundamentals followed by more advanced topics including hierarchical models, Markov Chain Monte Carlo and other methods for sampling from the posterior distribution, robustness and sensitivity analysis, and approaches to model selection and diagnostics. Features nontrivial applications of Bayesian methods from diverse scientific fields, with emphasis on biomedical research.
PHP 2550 Practical Data Analysis
  Covers practical skills required for successful analysis of scientific data including statistical programming, data management, exploratory data analysis, simulation and model building and checking. Tools will be developed through a series of case studies based on different types of data requiring a variety of statistical methods. Modern regression techniques such as cross-validation, bootstrapping, splines and bias-variance tradeoff will be emphasized. Students should be familiar with statistical inference as well as regression analysis. The course will use the R programming language.
PHP 2560 Statistical Programming with R
  Statistical computing is an essential part of analysis. Statisticians need not only to run existing computer software but also to understand how that software functions. Students will learn fundamental concepts, including data management, data types, data cleaning and manipulation, databases, graphics, functions, loops, simulation and Markov Chain Monte Carlo, through working with various statistical analyses. Students will learn to write code in an organized fashion with comments. This course will be taught using both the R and Julia languages in a flipped format.
PHP 2561 Methods in Informatics and Data Science for Health
  This course will teach informatics and data science skills needed for research in public health and biomedicine. Particular emphasis will be given to formalisms and algorithms used within the context of biomedical research and health care, including those used in biomolecular sequence analysis, electronic health records, clinical decision support, and public health surveillance. General programming language skills will be taught (in Julia) within these contexts. Mastery of informatics and data science skills will be assessed by a final project done within a health or biomedical context.
PHP 2570 Health Data Science
  This course is designed to introduce students to the practice of data science in health related fields via presentation and in-depth discussion of case studies of current or recently completed projects. The case studies will be selected to highlight important areas of research and health policy analysis. It is intended for students with advanced training in data science methods and computing at the level of courses offered in the Biostatistics Masters Program.
PHP 2580 Statistical Inference II
  Second of two courses that provide a comprehensive introduction to the theory of modern statistical inference. PHP 2580 covers such topics as non-parametric statistics, quasi-likelihood, resampling techniques, statistical learning, and methods for high-dimensional bioinformatics data.
PHP 2601 Linear Models
  This course will focus on the theory and applications of linear models for continuous responses. Linear models deal with continuously distributed outcomes and assume that the outcomes are linear combinations of observed predictor variables and unknown parameters, to which independently distributed errors are added. Topics include matrix algebra, multivariate normal theory, estimation and inference for linear models, and model diagnostics.
PHP 2602 Analysis of Lifetime Data
  Comprehensive overview of methods for inference from censored event time data, with emphasis on nonparametric and semiparametric approaches. Topics include nonparametric hazard estimation, semiparametric proportional hazards models, frailty models, multiple event processes, with application to biomedical and public health data. Computational approaches using statistical software are emphasized.
PHP 2605 Generalized Linear Models
  This course will focus on the theory and application of generalized linear models (GLMs), a unified statistical framework for regression analyses. Specifically, we will focus on using GLMs to model categorical outcomes. GLMs for categorical outcomes include logistic regression, the proportional odds model, and Poisson regression. Maximum likelihood estimation and inference will be introduced in the GLM context.
PHP 2610 Causal Inference & Missing Data
  Systematic overview of modern statistical methods for handling incomplete data and for drawing causal inferences from "broken experiments" and observational studies. Topics include modeling approaches, propensity score adjustment, instrumental variables, inverse weighting methods and sensitivity analysis. Case studies used throughout to illustrate ideas and concepts.
PHP 2620 Statistical Methods in Bioinformatics
  Introduction to statistical concepts and methods used in selected areas of bioinformatics. Organized in three modules, covering statistical methodology for: (a) analysis of microarray data, with emphasis on application in gene expression experiments, (b) proteomics studies, (c) analysis of biological sequences. Brief review and succinct discussion of biological subject matter will be provided for each area.
PHP 2650 Statistical Learning/Big Data
  This course introduces modern statistical tools to analyze big data, including three interconnected components: computing tools, statistical machine learning, and scalable algorithms. It introduces the principal techniques: extract and organize data from complex sources, explore patterns, frame statistical problems, build computational algorithms, and disseminate reproducible research. Topics include web data extraction, database management, exploratory data analysis, dimension reduction, convex optimization algorithms, high-dimensional linear/nonlinear models, tree/ensemble methods, and predictive modeling. These techniques are illustrated using big data examples from many scientific disciplines.
DATA 2020 Probability, Statistics & Machine Learning
  This course is offered for the Data Science Initiative. It covers topics in statistical learning, including regression, classification, model selection, and causal inference.

 

All course offerings are subject to change. Consult Banner for the most up-to-date schedule. The University Bulletin also contains a comprehensive list of all Public Health courses.

In response to the National Institutes of Health (NIH) notice NOT-OD-13-093 and the Brown University School of Public Health mandate regarding the use of Individual Development Plans (IDP), all students in the Department of Biostatistics, regardless of funding source, are required to complete and submit an IDP in consultation with their advisor. Specifically:

  • Incoming, matriculating students must complete an IDP, in consultation with their advisor, by the beginning of their second semester.  
  • All students must submit an updated IDP, in consultation with their advisor, on an annual basis.  

The IDP is a valuable tool that gives students the opportunity to consider and address their short-term and long-term career goals.  In order to achieve compliance with the IDP policy, please fill out the Individual Development Plan for Biostatistics, discuss with your advisor, and submit your completed form.  

Note: New students will be provided their login credentials following orientation.