ASA Connect

  • 1.  Data files hosted on Amstat website disappearing

    Posted 09-13-2017 11:26

    Has anyone else had a problem with data files stored on the Amstat website disappearing? I use a data file on their website that is an example of Simpson's paradox published in the Journal of Statistics Education.
    Paper here:
    ww2.amstat.org/publications/jse/v22n1/mickel.pdf
    Dataset here:

    http://ww2.amstat.org/publications/jse/v22n1/mickel/paradox_data.csv

    The data file has disappeared and been restored twice. The IT department thinks that it has been identified as a corrupted file, so an automated process removes it. ASA doesn't seem interested in continuing to host the data file. I can certainly distribute the file to my students myself and teach them how to load it that way, but that doesn't help future instructors who want to use the data. More likely, I will find another example of Simpson's paradox whose data is easier for students to load.

    I'm curious whether anyone else has experienced an issue with datafiles disappearing.
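
    One way to keep a class script working when a hosted file goes missing is to try the URL first and fall back to a local copy. The following is only a minimal sketch: the function name load_rows and its parameters are hypothetical, and it assumes the file is a plain UTF-8 CSV with a header row.

```python
import csv
import io
import urllib.request


def load_rows(url, fallback_path, timeout=10):
    """Return the CSV rows as a list of dicts, preferring the hosted copy.

    If the URL cannot be fetched (missing file, network error, bad URL),
    the rows are read from fallback_path instead.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # utf-8-sig drops a byte-order mark if the file has one.
            text = resp.read().decode("utf-8-sig")
    except (OSError, ValueError):
        # URLError and HTTPError are subclasses of OSError;
        # ValueError covers a malformed URL.
        with open(fallback_path, encoding="utf-8-sig", newline="") as f:
            text = f.read()
    return list(csv.DictReader(io.StringIO(text, newline="")))
```

    A call would pass the dataset URL given above as the first argument and the path of a copy distributed to students as the second.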



    ------------------------------
    Janet Rosenbaum, Ph.D.
    Assistant Professor, Department of Epidemiology and Biostatistics
    SUNY Downstate School of Public Health, Brooklyn, NY
    ------------------------------


  • 2.  RE: Data files hosted on Amstat website disappearing

    Posted 09-25-2017 09:57
    Stan Taylor and Amy Mickel made a great contribution to statistical education when they published their article showing Simpson's paradox: how age confounded the association between race/ethnicity and the expenditure of funds to developmentally-disabled Californians. Making their data available for others was an equally significant contribution. Their contribution is increasingly valuable as statistical educators look for ways to include real data involving multivariable thinking and confounding in STAT 101 per the 2016 update to the GAISE guidelines.

    Upon learning about the AmStat hosted data files disappearing, I contacted Stan and Amy.  Stan agreed to allow their files to be hosted on the statistical literacy website: www.StatLit.org.  In 2016, this free website had 280,000 visits and 470,000 downloads.  The Taylor-Mickel data, documentation and paper are now also available at the following locations:
    Data:  http://www.StatLit.org/XLS/2014-Taylor-Mickel-Paradox-Data.xlsx
    Documentation: http://www.StatLit.org/pdf/2014-Taylor-Mickel-Paradox-Documentation.pdf
    Article: http://www.StatLit.org/pdf/2014-Taylor-Mickel-JSE.pdf

    ------------------------------
    Milo Schield
    Editor and webmaster of www.StatLit.org
    Member, International Statistical Institute (ISI)
    US Director, International Statistical Literacy Project (ISLP)
    Vice-President of the National Numeracy Network (NNN)
    ------------------------------



  • 3.  RE: Data files hosted on Amstat website disappearing

    Posted 09-26-2017 09:44
    Edited by David Norris 09-27-2017 06:43
    The more modern way to reliably and permanently post data files and other materials (with licensing and automatically generated DOIs, no less) is via the Open Science Framework (OSF). Using OSF to host material for statistics education would have the additional advantage of introducing students to elements of modern scientific practice that have come to prominence amid the reproducibility crisis.

    ------------------------------
    David C. Norris, MD
    Precision Methodologies, LLC
    Seattle, WA
    ------------------------------



  • 4.  RE: Data files hosted on Amstat website disappearing

    Posted 09-27-2017 12:44
    Hello, all!

    We've been following this thread here at the ASA, and wanted to let you know we've identified the issue.

    This file was the only .CSV file on our ww2 website that wasn't "temporary." .CSV files are created by various applications when people generate downloads. Once they are downloaded, those .CSV files are no longer needed, so we have an internal process that runs every night and automatically cleans up .CSV files older than 30 days. This particular file (the only .CSV file on the website that *shouldn't* have been cleaned up and deleted) was getting caught in that cleanup process.

    We've marked it now so it's not caught in the cleanup process. It should be available on a reliable basis by the end of the week.
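
    For readers curious how a cleanup like this can skip a marked file, here is a rough sketch. It is not ASA's actual process, which is not described beyond the paragraphs above: the function name clean_old_csv, the keep set used as the "mark," and the use of file modification time to measure age are all assumptions made for illustration.

```python
import time
from pathlib import Path

MAX_AGE_DAYS = 30  # age threshold taken from the description above


def clean_old_csv(root, keep=frozenset(), max_age_days=MAX_AGE_DAYS, now=None):
    """Delete .csv files under root older than max_age_days.

    keep holds paths (relative to root) that must never be deleted.
    Returns the list of paths that were removed.
    """
    root = Path(root)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    protected = {(root / name).resolve() for name in keep}
    deleted = []
    # Materialize the listing first so deleting does not disturb the walk.
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() != ".csv":
            continue
        if path.resolve() in protected:
            continue  # the "marked" file is skipped regardless of age
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

    Run nightly with the permanent dataset listed in keep, a job of this shape would remove stale download files while leaving the marked file in place.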

    Thank you for your patience as we tracked down the solution to this issue!

    - Lara

    ------------------------------
    Lara Harmon
    Marketing and Online Community Coordinator
    American Statistical Association
    ------------------------------