Health Data Science: State of the Art and a Look Into the Future
A Symposium celebrating 30 Years of Biostatistics at Brown University
October 18th, 2024 | 8:30am- Welcome and refreshments
Interested in attending our celebration for 30 Years of Biostatistics at Brown University? Please RSVP by October 10, 2024!
Agenda
8:30am
Welcome and refreshments
9:15am
Introduction by Joseph Hogan, Chair of Biostatistics
9:30am
Session 1: Clinical Evaluation of AI, future of Clinical trial design
- Jon Steingrimsson, Brown University
- Sumithra Mandrekar, Mayo Clinic
Title: Design and Conduct of Cancer Clinical Trials: 2024 and Beyond
- Jean Feng, University of California San Francisco
- Discussant: Constantine Gatsonis, Brown University
10:45am
Coffee break
11:00am
Session 2: Analysis of large observational data
- Liz Stuart, Johns Hopkins Bloomberg School of Public Health
Title: Integrating Data for Causal Inference
- Mike Daniels, University of Florida
Title: A Bayesian Nonparametric Approach For Causal Inference In EHR Data In The Presence Of Nonignorable Missingness
- Youjin Lee, Brown University
- Discussant: Rebecca Hubbard, Brown University
12:15pm
Lunch break
1:30pm
Session 3: Statistical inference of Massive Data
- Yi Zhao, Indiana University School of Medicine
Title: Beyond Massive Univariate Tests: Covariance Regression Reveals Complex Patterns of Brain Functional Connectivity
- Ying Ma, Brown University
- Lorin Crawford, Microsoft Research and Brown University
- Discussant: Jean Wu, Brown University
2:55pm
Break
3:10pm
Panel Discussion: Biostatistics research and education as essential academic units in the era of HDS and AI
Moderated by Joseph Hogan:
- Francesca Dominici, Harvard University
- Kiros Berhane, Columbia University
- Alice Paul, Brown University
- Xihong Lin, Harvard University
4:10pm
Closing Remarks
Poster session highlighting the work of our faculty, alumni and students
Drinks and appetizers served
Abstract and Bios
Sumithra J. Mandrekar, Ph.D., is currently Professor of Biostatistics and Oncology at the Mayo Clinic, Rochester MN, and the Group Statistician and Program Director for the Statistics and Data Management Center for the Alliance for Clinical Trials in Oncology. Alliance is one of the 4 NCI-funded national clinical trials networks for the conduct of phase II and III clinical trials in adult cancer. She is widely recognized for significant contributions to the statistical methodology for the design, conduct and analysis of clinical trials, particularly in oncology; for leadership in clinical trials and data management coordination at Mayo Clinic and the Alliance for Clinical Trials in Oncology; and for leadership on national and international steering committees and advisory panels related to cancer, including the National Cancer Institute Clinical and Translational Advisory Committee (CTAC). She is a fellow and past president of the Society for Clinical Trials. Dr. Mandrekar’s primary research interests include adaptive dose-finding early phase trial designs, trial designs in the late phase and marker validation setting, and general clinical trial methodology related to streamlining the conduct of clinical trials and identification of alternative cancer clinical trial endpoints.
Title: Design and Conduct of Cancer Clinical Trials: 2024 and Beyond
Elizabeth A. Stuart, Ph.D., is the Frank Hurley and Catharine Dorrier Chair and Bloomberg Professor of American Health in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health, with joint appointments in the Department of Mental Health and the Department of Health Policy and Management. She was previously Executive Vice Dean for Academic Affairs at the School. She received her PhD in Statistics from Harvard University in 2004. Her research interests are in design and analysis approaches for estimating causal effects in experimental and non-experimental studies, including questions around the external validity of randomized trials and the internal validity of non-experimental studies, as well as methods for combining data sources to assess treatment effect heterogeneity and methods for evidence synthesis. She has published over 350 papers and has received research funding for her work from the National Science Foundation, the Institute of Education Sciences, the WT Grant Foundation, and the National Institutes of Health and has served on advisory panels for the US Department of Education, and the Patient Centered Outcomes Research Institute. She is a fellow of the American Statistical Association and the American Association for the Advancement of Science, received the mid-career award from the Health Policy Statistics Section of the ASA, the Gertrude Cox Award for applied statistics, Harvard University’s Myrto Lefkopoulou Award for excellence in Biostatistics, and the Society for Epidemiologic Research Marshall Joffe Epidemiologic Methods award. She currently serves on the National Academies of Sciences, Engineering, and Medicine (NASEM) Committee on National Statistics and co-chairs NASEM's Committee on Applied and Theoretical Statistics.
Title: Integrating Data for Causal Inference
Abstract
Many causal questions of interest cannot be answered through analysis of a single dataset, and as data becomes increasingly available, there is more and more interest in leveraging that data to answer nuanced questions. Such questions range from examining the generalizability of randomized trial results to target populations to better understanding effect heterogeneity by combining small (unbiased) randomized trials with large (but confounded) non-experimental data sources. This talk will discuss methods for causal inference in such integrated datasets, including both the promise and potential for doing so, as well as implementation challenges, such as when the measures in the different data sources are discordant. Motivating examples will come from medicine and public health, with lessons for a range of fields and final comments on the broader field of evidence synthesis for causal inference.
Yi Zhao, Ph.D., is an Associate Professor, Siu L. Hui Scholar of Biostatistics, in the Department of Biostatistics and Health Data Science at Indiana University School of Medicine. Her research interests include causal mediation analysis, decomposition methods, multiview data integration, density regression, and neuroimaging applications.
Title: Beyond Massive Univariate Tests: Covariance Regression Reveals Complex Patterns of Brain Functional Connectivity
Abstract
Studies of brain functional connectivity typically involve massive univariate tests, performing statistical analysis on each individual connection. In this study, we consider the problem of regressing covariance matrices on associated covariates. The goal is to use covariates to explain variation in covariance matrices across units. As such, we introduce Covariate Assisted Principal (CAP) regression, an optimization-based method for identifying components associated with the covariates using a generalized linear model approach. For high-dimensional data, a well-conditioned linear shrinkage estimator of the covariance matrix is introduced. With multiple covariance matrices, the shrinkage coefficients are proposed to be common across matrices. Theoretical studies demonstrate that the proposed covariance matrix estimator is optimal, achieving the uniformly minimum quadratic loss asymptotically among all linear combinations of the identity matrix and the sample covariance matrix. Under regularity conditions, the proposed estimator of the model parameters is consistent. We develop computationally efficient algorithms to jointly search for common linear projections of the covariance matrices, as well as the regression coefficients. The superior performance of the proposed approach over existing methods is illustrated through simulation studies. Applied to resting-state functional magnetic resonance imaging (fMRI) studies, the proposed approach regresses whole-brain functional connectivity on covariates and enables the identification of relevant brain subnetworks.
Mike Daniels, Ph.D., received his undergraduate degree from Brown University in Applied Math and doctoral degree from Harvard University in Biostatistics. He has been on the faculty at Iowa State and University of Texas at Austin. Currently, Daniels is Professor, Andrew Banks Family Endowed Chair, and Chair in the Department of Statistics at the University of Florida. He is a past president of ENAR. He is a fellow of the American Statistical Association, former chair of the Statistics in Epidemiology Section of the American Statistical Association (ASA), former chair of the Biometrics Section of the ASA, and former editor of Biometrics. He has received the Lagakos Distinguished Alumni Award from Harvard Biostatistics and the L. Adrienne Cupples Award from Boston University. He has published extensively on Bayesian methods for missing data, longitudinal data and causal inference and has been funded by NIH R01 grants as PI and/or MPI since 2001. He also has a strong and productive record of collaborative research, with a focus on behavioral trials in smoking cessation and weight management, muscular dystrophy, and HIV.
Title: A Bayesian Nonparametric Approach For Causal Inference In EHR Data In The Presence Of Nonignorable Missingness
Abstract
We propose an approach for missingness in EHRs using Bayesian nonparametric (BNP) models. We show how to introduce sensitivity parameters corresponding to nonignorable missingness in the outcome and confounders by extracting unidentified distributions from the BNP model and reconstructing the distribution of interest. We also flexibly include auxiliary covariates to move closer to MAR. We use G-computation based on the reconstructed distribution to compute causal estimands of interest. We use our approach to assess the comparative effectiveness of two bariatric surgeries on BMI 18 months after surgery.
Joint work with David Lindberg (University of Florida) and Sebastien Haneuse (Harvard University)
Jean Feng, Ph.D., is an Assistant Professor in the Department of Epidemiology and Biostatistics at the University of California, San Francisco and the UCSF-UC Berkeley Joint Program in Computational Precision Health and a principal investigator at the UCSF-Stanford Center of Excellence in Regulatory Science and Innovation (CERSI). She is also the data science lead on the predictive analytics team for the Zuckerberg San Francisco General Hospital. Her research interests span the interpretability, reliability, and regulation of machine learning (ML) algorithms in healthcare.
Francesca Dominici, Ph.D., is the Clarence James Gamble Professor of Biostatistics, Population and Data Science at the Harvard T.H. Chan School of Public Health and Director of the Harvard Data Science Initiative at Harvard University. She is a member of the National Academy of Medicine and of the International Society of Mathematical Statistics. In 2024, she was named by TIME100 Health as one of the most influential scientists in global health in the world. Before being appointed founding Director of the Harvard Data Science Initiative, she was Senior Associate Dean for Research at the Harvard TH Chan School of Public Health.
Dominici’s research has focused on machine learning, Artificial Intelligence, causal inference, and data science to impact climate and environmental policy. Her air pollution studies have directly and routinely impacted air quality policy, leading to more stringent ambient air quality standards in the U.S. Her work has been covered by the New York Times, the Los Angeles Times, the BBC, the Guardian, CNN, and NPR.
Dominici is an advocate for the career advancement of women faculty. Her work on the Johns Hopkins University Committee on the Status of Women earned her the campus Diversity Recognition Award in 2009. At the T.H. Chan School of Public Health, she has led the Committee for the Advancement of Women Faculty.
Kiros Berhane, Ph.D., is the Cynthia and Robert Citrone-Roslyn and Leslie Goldstein Professor and Chair of the Department of Biostatistics at Columbia University. His expertise is in development of methods for complex data structures on multi-factorial health effects. He is Contact PI of the U2R component of the GEOHealth Hub for Eastern Africa focusing on health impacts of environmental hazards and climate change. He is a well-funded researcher in statistical methodology development and its application to a wide array of domain areas of public health as well as training programs that include the “Advancing Public Health Research in Eastern Africa through Data Science Training (APHREA-DST)” to develop new graduate training programs in public health data science at University of Nairobi (Kenya) and Addis Ababa University (Ethiopia) – as part of NIH’s DS-I Africa initiative. He recently served as a member of the committee of the National Academies of Sciences, Engineering, and Medicine (NASEM) on Assessing Causality from a Multidisciplinary Evidence Base for National Ambient Air Quality Standards and also as a member of the core panel for the Lancet Commission on the Future of Health and Economic Resilience of Africa (FHERA). He serves on the editorial boards of several scientific journals, and he is currently serving as a member of Science magazine’s Board of Reviewing Editors. He was a Fulbright Scholar in 2016-2017. He is an elected fellow of the American Statistical Association.
Xihong Lin