Jong-Hyeon Jeong, Professor of Biostatistics, University of Pittsburgh
Dr. Jeong is professor and interim chair of biostatistics at the University of Pittsburgh. He has made significant contributions to biostatistical education, research, and leadership in the statistics community in Pittsburgh and at large. He has graduated 6 MS and 11 PhD students and is currently working with two PhD students. He also played a critical role in the growth, development, and modernization of the graduate programs at Pitt Biostatistics. His efforts have led to two new concentrations in the MS program: Health Data Science, and Statistical and Computational Genomics. Dr. Jeong maintains outstanding research programs in time-to-event data analysis, clinical trials, and statistical learning for precision medicine. He has authored or co-authored three books and currently serves on the editorial board of Lifetime Data Analysis. Dr. Jeong is also active in our profession. He has served on the ASA career development committee and on the board of directors of the Korean International Statistical Society, and as interim chair he has supported ENAR’s coalition for junior researchers and various activities of the ASA Pittsburgh chapter.
The 2021 ASA Pittsburgh Chapter Banquet is a virtual event on Friday, April 23. The banquet features a keynote speech by Dr. Bin Yu of UC Berkeley (Statistics; Electrical Engineering & Computer Sciences); a student poster session and presentation of student awards; recognition of Chapter members’ awards and honors; and the presentation of the 2021 Statistician of the Year.
Bin Yu, Chancellor’s Distinguished Professor and Class of 1936 Second Chair, Departments of Statistics and Electrical Engineering and Computer Sciences, UC Berkeley
Talk Title: Veridical Data Science for biomedical discovery: detecting epistatic interactions with epiTree.
Abstract:
“A.I. is like nuclear energy — both promising and dangerous” — Bill Gates, 2019.
Data Science is a pillar of A.I. and has driven most of the recent cutting-edge discoveries in biomedical research. In practice, Data Science has a life cycle (DSLC) that includes problem formulation, data collection, data cleaning, modeling, result interpretation and the drawing of conclusions. Human judgment calls are ubiquitous at every step of this process, e.g., in choosing data cleaning methods, predictive algorithms and data perturbations. Such judgment calls are often responsible for the “dangers” of A.I. To mitigate these dangers as far as possible, we developed a framework based on three core principles: Predictability, Computability and Stability (PCS). Through a workflow and documentation (in R Markdown or Jupyter Notebook) that allows one to manage the whole DSLC, the PCS framework unifies, streamlines and expands on the best practices of machine learning and statistics, bringing us a step closer to veridical Data Science.
In this lecture, we will illustrate the PCS framework through epiTree, a pipeline to discover epistatic interactions from genomics data. epiTree addresses issues of scaling of penetrance through decision trees, significance calling through PCS p-values, and combinatorial search over interactions through iterative random forests (which is a special case of PCS). Using UK Biobank data, we validate the epiTree pipeline through an application to the red-hair phenotype, where several genes are known to display epistatic interactions.
Biography
Bin Yu is Chancellor’s Distinguished Professor and Class of 1936 Second Chair in the departments of statistics and EECS at UC Berkeley. She leads the Yu Group, which consists of 15-20 students and postdocs from Statistics and EECS. She was formally trained as a statistician, but her research extends beyond the realm of statistics. Together with her group, she has leveraged new computational developments to solve important scientific problems by combining novel statistical machine learning approaches with the domain expertise of her many collaborators in neuroscience, genomics and precision medicine. She and her team develop relevant theory to understand random forests and deep learning for insight into and guidance for practice.
She is a member of the U.S. National Academy of Sciences and of the American Academy of Arts and Sciences. She is a Past President of the Institute of Mathematical Statistics (IMS), a Guggenheim Fellow, Tukey Memorial Lecturer of the Bernoulli Society, Rietz Lecturer of IMS, and a COPSS E. L. Scott prize winner.
She serves on the editorial board of the Proceedings of the National Academy of Sciences (PNAS) and on the scientific advisory committee of the UK Turing Institute for Data Science and AI.