Dear Colleagues,
The Section on Statistical Learning and Data Science (SLDS) is happy to present its December webinar by Prof. Weijie Su from UPenn. This is the last webinar of 2023. Thank you all for supporting the SLDS webinar series! Happy holidays!
Title: Can Statistics Save Machine Learning from a Crisis? A Regression Approach to Peer Review in NeurIPS/ICML
Speaker: Prof. Weijie Su, Wharton Statistics and Data Science Department, University of Pennsylvania
Date and Time: December 15, 2023, 2:30 to 4:00 pm Eastern Time
Registration Link: ASA SLDS Webinar Registration Link [eventbrite.com]
Abstract: In 2014, the number of submissions to NeurIPS was 1,678, but this number skyrocketed to 10,411 in 2022, putting a huge strain on the peer review process. In this talk, we attempt to address this challenge, starting with the following scenario: Alice submits a large number of papers to a machine learning conference and knows the ground-truth quality of her papers; given noisy ratings provided by independent reviewers, can Bob obtain accurate estimates of the ground-truth quality of the papers by asking Alice a question about the ground truth? First, if Alice is to answer the question truthfully because doing so maximizes her payoff, modeled as an additive convex utility over all her papers, we show that the question must be formulated as pairwise comparisons between her papers. Moreover, if Alice is required to provide a ranking of her papers, which is the most fine-grained question via pairwise comparisons, we prove that she would be truth-telling. By incorporating the ground-truth ranking, we show that Bob can obtain an estimator whose squared error is optimal, in certain regimes, among estimators based on any possible way of truthful information elicitation. Moreover, the estimated ratings are substantially more accurate than the raw ratings when the number of papers is large and the raw ratings are very noisy. Finally, we conclude the talk with an experiment with this scoring mechanism at ICML 2023. This talk is based on arXiv:2110.14802, arXiv:2206.08149, and arXiv:2304.11160.
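For readers who want a concrete picture of how a ranking could sharpen noisy ratings, here is a minimal sketch. The abstract does not name the estimator, so the use of isotonic regression (least-squares projection of the raw scores onto sequences consistent with the author's ranking) is an assumption made for illustration. The function names and the toy scores are hypothetical.

```python
def isotonic_nonincreasing(values):
    """Least-squares fit of a non-increasing sequence to `values`
    using the pool-adjacent-violators algorithm."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge while an earlier block's mean is below the next block's mean,
        # since that violates the non-increasing order.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def adjust_scores(raw_scores, ranking):
    """Adjust raw review scores so they agree with the author's ranking.

    `ranking` lists paper indices from best to worst (assumed truthful)."""
    ordered = [raw_scores[i] for i in ranking]
    fitted = isotonic_nonincreasing(ordered)
    adjusted = [0.0] * len(raw_scores)
    for position, paper in enumerate(ranking):
        adjusted[paper] = fitted[position]
    return adjusted


# Hypothetical example: four papers with noisy average review scores.
raw = [6.0, 7.0, 4.0, 5.0]
ranking = [0, 1, 2, 3]  # Alice says paper 0 is best, paper 3 is worst
adjusted = adjust_scores(raw, ranking)
print(adjusted)  # [6.5, 6.5, 4.5, 4.5]
```

In this toy case, scores that contradict the ranking are pooled into their average, so the total score is unchanged while the order agrees with Alice's ranking.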
Presenter: Weijie Su is an Associate Professor at the University of Pennsylvania, with an appointment in the Wharton Statistics and Data Science Department, where he is a co-director of the Penn Research in Machine Learning Center. Prior to joining Penn, he received his Ph.D. in statistics from Stanford University in 2016 under the supervision of Emmanuel Candes and his bachelor's degree from Peking University in 2011. His research interests span privacy-preserving data analysis, deep learning theory, statistical aspects of large language models, and high-dimensional statistics. He serves as an associate editor of the Journal of the American Statistical Association (Theory and Methods). He is a recipient of the Stanford Theodore Anderson Dissertation Award in 2016, an NSF CAREER Award in 2019, a Sloan Research Fellowship in 2020, the IMS Peter Gavin Hall Prize in 2022, the SIAM Early Career Prize in Data Science in 2022, and the ASA Gottfried Noether Early Career Award in 2023.
------------------------------
Zhihua Su, PhD
Associate Professor
Department of Statistics
University of Florida
zhihuasu@stat.ufl.edu
------------------------------