ASA Connect

  • 1.  What stats classes should a data scientist take?

    Posted 01-08-2018 10:05
    Suppose a student working towards an MS in Comp Sci or some other related degree told you they had 2 classes they could take from the stats department as their cognates. They ask you what they should take to be a better data scientist. What classes would you suggest and why?


    Suppose you were on a university committee and were asked which statistical methods students in the new MS Data Science program will need. Those methods would become part of a couple of "overview" courses targeted at the MS DS students. What methods would you suggest? (You may assume that incoming students took an intro to stats class that covered material up to simple ANOVA.)

    Suppose that the program also wants to have 2-3 stats classes as electives. Knowing that the MS DS students will take the overview courses, what stats (or related) courses would you suggest?

    ------------------------------
    Andrew Ekstrom

    Statistician, Chemist, HPC Abuser;-)
    ------------------------------


  • 2.  RE: What stats classes should a data scientist take?

    Posted 01-09-2018 10:07

    Southern Methodist University in Dallas, Texas, has been running a Master's in Data Science since 2015, and we are in conversation about a bachelor's in data science, too. Your question is timely!


    For master's students, I would suggest a course on regression and multivariate statistical methods. The regression course should cover lots of diagnostics and graphics. It should also include ridge regression, partial least squares, principal components, robust regression, multidimensional scaling, and related concepts (see Alan Izenman's book on multivariate statistical methods: https://astro.temple.edu/~alan/MMST/index.html).


    I also think a course in nonparametric statistics would be useful. This could include rank-based methods and computational methods like the bootstrap and jackknife, as well as nonparametric regression.


    A course on data visualization would also be good - one could argue that it would be more useful than either of the other two that I just mentioned.


    I know you asked about just one or two courses and not a whole curriculum, but since there are curriculum guidelines available, they may help you determine what courses would be most useful: https://www.amstat.org/asa/files/pdfs/EDU-DataScienceGuidelines.pdf


    Monnie McGee, PhD
    Associate Professor
    Statistical Science
    Southern Methodist University
    Office: 214-768-2462
    Fax: 214-768-4035





  • 3.  RE: What stats classes should a data scientist take?

    Posted 01-09-2018 10:40
    I would recommend courses based on the excellent text,

    "An Introduction to Statistical Learning: With Applications in R." by James, Witten, Hastie, and Tibshirani.

    It is a "lighter" version of "The Elements of Statistical Learning" by Hastie, Tibshirani, and Friedman.

    I believe the introductory book was specifically written for the types of courses you are asking about.

    ------------------------------
    Bob Lucas
    Principal
    Robert M. Lucas Consulting
    ------------------------------