Program Structure

The program is conducted over one academic year plus one summer. The regular program includes two semesters of coursework and a 5-10 week capstone project focused on data analysis in a particular application area. Pre-program summer classes are available for students who lack one or more of the basic prerequisites.

Nine credit units are required to complete the program: four in each of the two academic-year semesters and one (the capstone experience) in the summer. By subject area, the credits break down as follows:

  • 3 credits in mathematical and statistical foundations
  • 3 credits in data and computational science
  • 1 credit in societal implications and opportunities
  • 1 elective credit, drawn from a wide range of focused applications or deeper theoretical exploration
  • 1 credit capstone experience

Course Descriptions

DATA 1010: An Introduction to Topics in Probability, Statistics, and Machine Learning (Fall, 2 credits)

An introduction to the mathematical methods of data science through a combination of computational exploration, visualization, and theory. Students will learn programming basics, topics in numerical linear algebra and scientific computing, mathematical probability (probability spaces, expectation, conditioning, common distributions, law of large numbers and the central limit theorem), statistics (point estimation, confidence intervals, hypothesis testing, maximum likelihood estimation, density estimation, bootstrapping, and cross-validation), and machine learning (regression, classification, and dimensionality reduction, including Gaussian models, decision trees, neural networks, Bayesian networks, and principal component analysis). 

DATA 1030: An Introduction to Data and Computational Science (Fall, 2 credits) 

This class gives students hands-on experience with some of the applied statistical concepts and software tools that are essential for modern data science, including gathering data, data wrangling, exploratory data analysis, and machine learning. The course also explores algorithms and data structures, as well as different systems that organize data for efficient storage and computation. 

DATA 2020: Probability, Statistics, and Machine Learning: Advanced Methods (Spring, 1 credit)

This course provides a modern introduction to methods for regression analysis and statistical learning, with an emphasis on application of the methods in practical settings. Regression methods are developed in the context of learning relationships from observed data. Methods include basics of linear regression, variable selection and dimension reduction, approaches to nonlinear regression, and dealing with different types of data.

DATA 2040: Data and Computational Science (Spring, 1 credit) 

This advanced-methods course includes topics such as data mining, computational statistics, machine learning and predictive modeling, and big data analytics algorithms.

DATA 2080: Data and Society (Spring, 1 credit)

A uniquely Brown course built around case studies, covering topics such as the broader policy and ethical implications of the data revolution.

DATA 2050: Capstone Project (Summer, 1 credit)

Students will work on a project with real data, potentially in any one of the areas covered by the elective course. A faculty member from one of the four departments will oversee the capstone course, although each student may collaborate with an additional faculty member, postdoc, or industry partner on his or her project. Each student will prepare a paper and/or oral presentation of his or her work. The summer capstone should entail at least 180 hours of work (to receive one course credit) and may therefore be completed in 5-10 weeks, which works out to roughly 18 to 36 hours per week. The project may begin and end at any time during the summer. A letter grade will be awarded for the summer capstone course.