Data Science/Big Data Sessions at the 2016 Joint Statistical Meetings

By Steve Pierson posted 04-27-2016 21:18

  

Following up on my 2013, 2014, and 2015 blog entries on the same topic, here are the 2016 sessions that mention "Big Data" or "Data Science" in the title or the abstract, or seem strongly related to the topic. See also,

The first list is based on session titles that contain either term (using Basic Search on the JSM 2016 Online Program). The second list is where "Data Science" appears in the abstract of a talk or session (using Abstract Keyword Search). The third list is where "Big Data" appears in the abstract of a talk or session (also using Abstract Keyword Search). Please let me know of any I missed, or add them through the comment section below.

 

Sessions or Continuing Ed Courses with Big Data or Data Science in the Title: 

6 Open Source Statistical Software for Data Science — Invited Papers Sun, 7/31/2016, 2:00 PM - 3:50 PM CC-W184bc
20 Ranking Methods: Infinite Permutations, Statistical Physics, Copulas, and Seriation for Discovery of Subpopulations in Big Data — Topic Contributed Papers Sun, 7/31/2016, 2:00 PM - 3:50 PM CC-W175c
149 Recent Advances and Challenges of Big Data Inference with Complex Structures — Invited Papers Mon, 8/1/2016, 10:30 AM - 12:20 PM CC-W190a
221 The ASA Journal of Data Science: A Showcase — Invited Papers Mon, 8/1/2016, 2:00 PM - 3:50 PM CC-W181a
259 Nonparametric Methods for "Big Data" — Contributed Papers Mon, 8/1/2016, 2:00 PM - 3:50 PM CC-W186c
279 Introductory Overview Lecture: Data Science — Invited Special Presentation Tue, 8/2/2016, 8:30 AM - 10:20 AM CC-W375b
287 Economic and Business Applications in High Dimensional and Big Data contexts — Invited Papers Tue, 8/2/2016, 8:30 AM - 10:20 AM CC-W184bc
290 Bridging BFF (Bayesian/frequentist/fiducial) inferences in the era of data science (No. 1) — Invited Papers Tue, 8/2/2016, 8:30 AM - 10:20 AM CC-W178a
300 Statistical Challenges in Big Data, Finance and Business Analytics — Topic Contributed Papers Tue, 8/2/2016, 8:30 AM - 10:20 AM CC-W184d
338 Big data challenges and statistical advances in functional genomics — Topic Contributed Papers Tue, 8/2/2016, 10:30 AM - 12:20 PM CC-W195
408 Bridging BFF (Bayesian/frequentist/fiducial) inferences in the era of data science (No. 2) — Invited Papers Tue, 8/2/2016, 2:00 PM - 3:50 PM CC-W375b
465 Data Science for Health Policy: A Broad Tent — Invited Papers Wed, 8/3/2016, 8:30 AM - 10:20 AM CC-W192a
475 Reproducibility in Statistics and Data Science — Invited Papers Wed, 8/3/2016, 8:30 AM - 10:20 AM CC-W196c
522 Data Science Education — Invited Panel Wed, 8/3/2016, 10:30 AM - 12:20 PM CC-W190a
528 The NSF/NIH/SAMSI Workshop on Interdisciplinary Approaches to Biomedical Data Science Challenges — Topic Contributed Papers Wed, 8/3/2016, 10:30 AM - 12:20 PM CC-W183b
536 Ethics in Data Science for Statistical Consultants — Topic Contributed Panel Wed, 8/3/2016, 10:30 AM - 12:20 PM CC-W192b
579 Challenges and Opportunities for Analysis of High-Dimensional and Big Data — Invited Papers Wed, 8/3/2016, 2:00 PM - 3:50 PM CC-W183a
634 Analysis, Storage, and Privacy for Big Data — Invited Papers Thu, 8/4/2016, 8:30 AM - 10:20 AM CC-W179a
652 Big Data and Data Science Education — Contributed Papers Thu, 8/4/2016, 8:30 AM - 10:20 AM CC-W192b
678 Strategies for Developing Undergraduate Data Science Programs — Invited Papers Thu, 8/4/2016, 10:30 AM - 12:20 PM CC-W192c
688 Personalized intervention based on health care big data research — Topic Contributed Papers Thu, 8/4/2016, 10:30 AM - 12:20 PM CC-W193a
691 Modern Biosurveillance at the Edge of Online Social Media, Social Networks and Nontraditional Big Data — Topic Contributed Papers Thu, 8/4/2016, 10:30 AM - 12:20 PM CC-W181a

 

Presentation Abstracts with "Data Science" in the Abstract: There are 33 of these, and they probably include talks from the sessions listed above (and talks below).

Sunday, 07/31/2016
Fighting fraud with statistics! 
Alyssa Frazee, Stripe 


The future of the journal Biostatistics 
Dimitris Rizopoulos, Erasmus University Medical Center; Jeffrey Leek, Johns Hopkins Bloomberg School of Public Health 


We Are What We Ask: Mapping the Ecosystem of Software Development Using Stack Overflow Data 
David G Robinson, Princeton University 


The Python Data Science Stack 
Jake VanderPlas, University of Washington 
2:25 PM 

Thinking with data using R and RStudio: powerful idioms for analysts 
Nicholas Jon Horton, Amherst College; Randall Pruim, Calvin College; Daniel Kaplan, Macalester College 
4:05 PM 

Monday, 08/01/2016
From Statistician to Data Scientist: How to Prepare? 
Ming Li, REANCON 


What Can Statistics Learn from Machine Learning? And Vice Versa? 
Ryan Tibshirani, Carnegie Mellon University 


Are Volcanic Eruptions Increasing?: An Example of Teaching Data Wrangling and Visualization in Stat 2 
Kelly McConville, Swarthmore College 


Teaching data visualization to 100k data scientists - lessons from evidence based data analysis 
Jeffrey Leek, Johns Hopkins Bloomberg School of Public Health 
9:55 AM 

Are Volcanic Eruptions Increasing?: An Example of Teaching Data Wrangling and Visualization in Stat 2 
Kelly McConville, Swarthmore College 
12:05 PM 

Tuesday, 08/02/2016
Data Science: Bridging Academia and Industry 
Justin Dyer, Google, Inc.; Donal McMahon, Google, Inc. 


How NOT To Do A/B Testing 
David Charles Draper, University of California, Santa Cruz 
9:25 AM 

Computational Thinking and Statistical Thinking: Foundations of Data Science 
Ani Adhikari, University of California, Berkeley; Michael Jordan, University of California, Berkeley 
10:35 AM 

Dealing with credit data: the challenges and statistical solutions 
Giulianna Perrotti dos Reis, Credit Sesame 
11:00 AM 

Statistical Computing as an Introduction to Data Science 
Colin Rundel, Duke University 
11:00 AM 

The ASA DataFest: Learning by Doing 
Robert Gould, UCLA Dept. of Statistics 
11:25 AM 

Learning Communities: an Emerging Platform for Research in Statistics 
Mark Daniel Ward, Purdue University 
11:50 AM 

The professional melting pot: statisticians, data scientists, and health researchers talk shop to improve public well-being 
Nicholas Beyler, Mathematica Policy Research; Fei Xing, Mathematica Policy Research 
3:05 PM 

Wednesday, 08/03/2016
Members Choice: Hot Topics in Statistical Learning and Data Mining 
Glen Wright Colopy, University of Oxford 


Steps toward reproducible research 
Karl W Broman, University of Wisconsin-Madison 
8:55 AM 

Integrating reproducibility into the undergraduate statistics curriculum 
Mine Cetinkaya-Rundel, Duke University 
9:35 AM 

Data Science Education 
Constantine Gatsonis, Brown University; Alfred Hero, University of Michigan; Robert Kass, Carnegie Mellon University; John Lafferty, University of Chicago; Raghu Ramakrishnan, Microsoft Corporation 
10:35 AM 

Ethics in Data Science for Statistical Consultants 
Nik Andric, Deloitte Consulting; James Guszcza, Deloitte Consulting LLP; Andrej Zwitter, University of Groningen; Hope McIntyre, University of Virginia 
10:35 AM 

Carpe Datum! Bill Cleveland's contributions to data science and big data analysis 
Steve Scott, Google Analytics 
11:35 AM 

Thursday, 08/04/2016
Teaching Students to Work with Big Data through Visualizations 
Shonda Kuiper, Grinnell College 
8:35 AM 

A data visualization course for undergraduate data science students 
Silas Bergen, Winona State University 
8:50 AM 

Intro Stats for Future Data Scientists 
Brianna Heggeseth, Williams College; Richard De Veaux, Williams College 
9:05 AM 

An Undergraduate Data Science Program 
James Albert, Bowling Green State University; Maria Rizzo, Bowling Green State University 
9:20 AM 

Modernizing an Undergraduate Multivariate Statistics Class 
David Hitchcock, University of South Carolina; Xiaoyan Lin, University of South Carolina; Brian Habing, University of South Carolina 
9:35 AM 

Teaching Data Science Skills in an Introductory CS Course 
Olaf Hall-Holt, St. Olaf College 
10:35 AM 

Cross-Disciplinary Minor in Data Science: a new paradigm for partnership across disciplines 
Andrew Schaffner, California Polytechnic State University; Alexander Dekhtyar, California Polytechnic State University 
10:55 AM 

Detecting Text Reuse in State Legislative Bills 
Joe Walsh, University of Chicago; Matthew Burgess, University of Michigan; Eugenia Giraudy, YouGov; Julian Katz-Samuels, University of Michigan; Derek Willis, ProPublica; Rayid Ghani, Center for Data Science and Public Policy, University of Chicago 
10:55 AM 

Developing a Comprehensive Data Science Program 
Mark John Lancaster, Northern Kentucky University 
11:35 AM 

 

Presentation Abstracts with "Big Data" in the Abstract: There are 66 of these, and they probably include talks from the sessions and talks listed above.

Sunday, 07/31/2016
Efficient analytical tools for columnar in-memory data 
Wes McKinney, Cloudera, Inc. 
2:05 PM 

Can Early Analysis Predict Alzheimer's Trial Success 
Kun Jin, HHS/FDA/CDER 
2:50 PM 

Big data regression and prediction for high-throughput genomic data 
Weiqiang Zhou, Johns Hopkins University Bloomberg School of Public Health; Ben Sherwood, Johns Hopkins University Bloomberg School of Public Health; Zhicheng Ji, Johns Hopkins Bloomberg School of Public Health; Fang Du, Johns Hopkins Bloomberg School of Public Health; Jiawei Bai, Johns Hopkins Bloomberg School of Public Health; Hongkai Ji, Johns Hopkins Bloomberg School of Public Health 
2:50 PM 

Thinking with data using R and RStudio: powerful idioms for analysts 
Nicholas Jon Horton, Amherst College; Randall Pruim, Calvin College; Daniel Kaplan, Macalester College 
4:05 PM 

Generalized full matching 
Fredrik Sävje, UC Berkeley; Michael Higgins, Kansas State University; Jasjeet Sekhon, University of California-Berkeley 
4:25 PM 

A Multi-Resolution Model for Activation and Connectivity in fMRI Data with Functional Estimation of the Haemodynamic Response 
Stefano Castruccio, Newcastle University; Hernando Ombao, University of California, Irvine; Thomas Theussl, King Abdullah University of Science and Technology; Marc Genton, KAUST 
4:55 PM 

Monday, 08/01/2016
From Statistician to Data Scientist: How to Prepare? 
Ming Li, REANCON 


Where do marine mammals go? Bayesian data fusion provides the answer 
Yang Seagle Liu, University of British Columbia; James V. Zidek, Department of Statistics, University of British Columbia; Brian C Battaile, Marine Mammal Research Unit, Institute for the Oceans and Fisheries, University of British; Andrew W. Trites, Marine Mammal Research Unit, Institute for the Oceans and Fisheries, University of British 


Pitch quantification in baseball: Reducing a pitch to a single number 
Jason Wilson, Biola University 


Incorporating Big Data into an Introductory Statistics Course 
Paul Stephenson, Grand Valley State University; Laura Kapitula, Grand Valley State University 


Social Signal Processing: Building Computational Models of Human Behavior in Digital Environments 
William Rand, University of Maryland, College Park; David Darmon, Uniformed Services University of the Health Sciences; Michelle Girvan, University of Maryland
8:35 AM 

The biglasso Package: Extending Lasso Model Fitting to Big Data in R 
Yaohui Zeng, The University of Iowa; Patrick Breheny, University of Iowa 
8:35 AM 

Where do marine mammals go? Bayesian data fusion provides the answer 
Yang Seagle Liu, University of British Columbia; James V. Zidek, Department of Statistics, University of British Columbia; Brian C Battaile, Marine Mammal Research Unit, Institute for the Oceans and Fisheries, University of British; Andrew W. Trites, Marine Mammal Research Unit, Institute for the Oceans and Fisheries, University of British 
9:30 AM 

Pitch quantification in baseball: Reducing a pitch to a single number 
Jason Wilson, Biola University 
10:35 AM 

Incorporating Big Data into an Introductory Statistics Course 
Paul Stephenson, Grand Valley State University; Laura Kapitula, Grand Valley State University 
10:50 AM 

Nonparametric Distributed Learning Architecture: Algorithm and Application 
Scott Bruce, Temple University; Zeda Li, Temple University; Hsiang-Chieh Yang, Temple University; Subhadeep Mukhopadhyay, Temple University, Fox School of Business
11:20 AM 

Correcting Biases in Auxiliary Data to Produce Better Estimates 
Masahiko Aida, Civis Analytics 
11:50 AM 

Convergence and Stability Properties of Variance-Function Estimators Used in the Integration of Surveys and Alternative Data Sources 
John Eltinge, Bureau of Labor Statistics 
12:05 PM 

Big Data Algorithms for Rank-based Estimation 
John Kapenga, Western Michigan University; John Kloke, University of Wisconsin; Joseph McKean, Western Michigan University 
2:35 PM 

Optimal reconciliation of Constrained and Unconstrained Apparel Demand Forecasts using a Hierarchical Time Series Approach 
Ginger Holt, Walmart Labs 
3:05 PM 

Approximations of Markov Chains and Bayesian Inference 
James Johndrow, Duke University; Jonathan Mattingly, Duke University; Sayan Mukherjee, Duke University; David Dunson, Duke University
3:05 PM 

Causal inference from big data: Theoretical foundations and the data-fusion problem 
Elias Bareinboim, Purdue 
3:05 PM 

Robust Bayesian Inference via the Tilted Posterior 
Yixin Wang, Columbia University; David Blei, Columbia University 
3:35 PM 

Tuesday, 08/02/2016
Divide & Recombine (D&R) with Tessera: High Performance Computing for the Analysis of Big Data and High-Complexity Analytics 
Yuying Song, Department of Statistics of Purdue University; Bowei Xi, Department of Statistics of Purdue University; Ryan Hafen, Hafen Consulting, LLC; William S Cleveland, Department of Statistics of Purdue University 


Data Science: Bridging Academia and Industry 
Justin Dyer, Google, Inc.; Donal McMahon, Google, Inc. 


A NEW DISTRIBUTION TO DESCRIBE BIG DATA 
Yuanyuan Zhang, University of Manchester 


Bias correction in small samples from big data 
Jeffrey Chu 


Statistical Challenges in Big Data Analysis of the Hotel Industry 
Kai-Sheng Song, Department of Mathematics, University of North Texas
8:35 AM 

How NOT To Do A/B Testing 
David Charles Draper, University of California, Santa Cruz 
9:25 AM 

Nonparametric Regression with Adaptive Smoothness via a Convex Hierarchical Penalty 
Asad Haris, University of Washington; Ali Shojaie, University of Washington; Noah Simon, University of Washington 
10:05 AM 

Model Calibration Utilizing Summary-level Information from External Big Data 
Nilanjan Chatterjee, Johns Hopkins Univ; Yi-Hau Chen, Academia Sinica; Paige Maas, National Cancer Institute; Raymond Carroll, Texas A&M University 
10:35 AM 

The ASA DataFest: Learning by Doing 
Robert Gould, UCLA Dept. of Statistics 
11:25 AM 

Explicit vs. Implicit Data: Comparing Responses from a Web Survey to Behavioral Data from Smartphones 
Noble Kuriakose, SurveyMonkey 
11:35 AM 

A Generalized Ordered Response Model 
Kramer Quist, Brigham Young University; James McDonald, Brigham Young University; Carla Johnston, University of California Berkeley 
12:05 PM 

A Generalized Ordered Response Model 
Kramer Quist, Brigham Young University; James McDonald, Brigham Young University; Carla Johnston, University of California Berkeley 
12:05 PM 

Trading Strategy Using Stock Moves Prediction and Sentiment Analysis 
Brahim Brahim, Big Data Visualizations Inc.; Sun Makosso-Kallyth, Degroote Pain Center McMaster University 
2:20 PM 

How Many Processors Do We Really Need in Parallel Computing? 
Guang Cheng, Purdue; Zuofeng Shang, Binghamton Univ 
2:55 PM 

Being Bayesian in a Big Data World 
David Banks, Duke University 
2:55 PM 

Analysis of Methane Data Collected by Google Street View Vehicles 
Zachary Weller; Jennifer Hoeting, Colorado State University; Adam Gaylord, Colorado State University; Joe von Fischer, Colorado State University
3:05 PM 

Big Data Methods for Scraping Government Tax Revenue from the Web 
Brian Dumbacher, U.S. Census Bureau; Cavan Capps, U.S. Census Bureau 
3:35 PM 

Wednesday, 08/03/2016
A classroom data analysis project comparing 1960s local radio chart data to the national Billboard charts 
John Gabrosek, Grand Valley State University; Len O'Kelly, Grand Valley State University 


Statistical methods for Genomic Data Integration 
Veera Baladandayuthapani, The UT MD Anderson Cancer Center 
8:35 AM 

Time Delay Boolean Networks for Big Data 
Henry Lu, National Chiao Tung University 
9:20 AM 

Java as a platform for statistical computing 
Philip Steitz 
10:05 AM 

Data Science Education 
Constantine Gatsonis, Brown University; Alfred Hero, University of Michigan; Robert Kass, Carnegie Mellon University; John Lafferty, University of Chicago; Raghu Ramakrishnan, Microsoft Corporation 
10:35 AM 

PIE: Simple, Scalable and Accurate Posterior Interval Estimation 
Cheng Li, Duke University; Sanvesh Srivastava, The University of Iowa; David Dunson, Duke University 
11:20 AM 

On Safe Semi-Supervised Learning 
Kenneth Ryan; Mark Culp, West Virginia University 
11:20 AM 

Small-Area Estimation for High-Dimensional non-Gaussian Dependent Data 
Jonathan R Bradley, University of Missouri; Scott H Holan, University of Missouri; Christopher Wikle, University of Missouri 
11:35 AM 

Carpe Datum! Bill Cleveland's contributions to data science and big data analysis 
Steve Scott, Google Analytics 
11:35 AM 

Role of Functional Data Analysis in the Big Data Era: Applications to Precision Medicine 
Hulin Wu, University of Texas Health Science Center at Houston 
11:50 AM 

Scaling Up Statistical Models to Hadoop Using Tessera 
Jim Harner, West Virginia University 
11:55 AM 

Cognostics: Metrics enabling detailed interactive visualization of big data 
Barret Schloerke 
2:05 PM 

Covariance-Insured Screening Methods for Ultrahigh Dimensional Variable Selection 
Yi Li, University of Michigan; Ji Zhu, University of Michigan; Jiashun Jin, Carnegie Mellon University; Kevin He, University of Michigan; Yanming Li, University of Michigan
2:30 PM 

A new class of measures for independence test with its application in big data 
Qingcong Yuan, University of Kentucky; Xiangrong Yin, University of Kentucky 
2:35 PM 

Subsampling for Feature Selection from Large Regression Data 
Yiying Fan, Cleveland State University; Jiayang Sun, Case Western Reserve University 
2:35 PM 

Epigenome Isoform Analysis with Applications 
Hongkai Ji, Johns Hopkins Bloomberg School of Public Health; Weixiang Fang, Johns Hopkins Bloomberg School of Public Health 
2:45 PM 

Identifying Typical Patterns and Atypical Behavior in Copious Amounts of Streaming Data 
Brett Amidan, Pacific Northwest National Laboratory; James Follum, Pacific Northwest National Laboratory 
2:50 PM 

Making sense of digital experiments with Bayesian nonparametrics 
Matt Taddy, Chicago Booth 
2:55 PM 

Thursday, 08/04/2016
Teaching Students to Work with Big Data through Visualizations 
Shonda Kuiper, Grinnell College 
8:35 AM 

Quadratically regularized functional canonical correlation analysis and its application to genetic pleiotropic analysis of multiple phenotypes 
Nan Lin; Yun Zhu, Tulane University; Fen Peng, University of Texas Health Science Center at Houston; Jinying Zhao, Tulane University; Momiao Xiong, The University of Texas Health Science Center at Houston 
8:50 AM 

Storage Issues and Assessment Arising from Large Scale Simulations 
Emily Casleton, Los Alamos National Laboratory; Joanne Wendelberger, Los Alamos National Laboratory; Jonathan Woodring, Los Alamos National Laboratory 
9:10 AM 

Interpretable High-Dimensional Inference Via Score Maximization with an Application in Neuroimaging 
Simon Vandekar, University of Pennsylvania; Philip Reiss, New York University; Russell Shinohara, University of Pennsylvania 
9:20 AM 

HVAR: High Dimensional Forecasting via Interpretable Vector Autoregression 
David Matteson, Cornell University; William B. Nicholson, Cornell University; Jacob Bien, Cornell University
9:25 AM 

Differentially Private Data Synthesis Partitioning for Big Data 
Claire McKay Bowen, University of Notre Dame; Fang Liu, University of Notre Dame 
9:45 AM 

Big, Deep, and Dark Data: Fundamentals, Research Challenges, and Opportunities 
Ivo Dinov, Statistics Online Computational Resource 
10:05 AM 

Cross-Disciplinary Minor in Data Science: a new paradigm for partnership across disciplines 
Andrew Schaffner, California Polytechnic State University; Alexander Dekhtyar, California Polytechnic State University 
10:55 AM 

 
