
Basic Question About Maintaining Data and Code for Consultants

  • 1.  Basic Question About Maintaining Data and Code for Consultants

    Posted 10-21-2013 21:30
    Hi all!

    I am fairly new to this forum and to statistical consulting in general.  I have a question that may be very basic and/or have been addressed before...  I hope you won't laugh too hard at my inexperience...  :-)

    What is common practice related to client data and data analysis code (e.g. SAS)?  Do you typically delete the data, but retain the code once the project is complete?  Or is there another protocol that is typically used?

    I appreciate any input or experience you are willing to share.

    Thank you!
    Alicia

    -------------------------------------------
    Alicia Hansen
    Statistical Consultant
    -------------------------------------------


  • 2.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-21-2013 22:05
    Hi Alicia

    No, this is not a basic question and is one that continues to trip up even the most experienced consultant.

    The answer is that 'it depends'.

    It depends on what your contract says.  Hopefully this is all spelled out in the consulting contract.
    You really don't want to be in the business of being a data repository for all of the consulting jobs that you may ever have.

    Also, depending on the industry that you are or may be working in, the client may clearly own the data, the coding, the analysis, etc.
    and you will find that you will be required to attest that you have returned or obliterated all copies of data and coding in any form.
    Again, get it clear in the contract.

    Many grants and contracts lay out what must be done with the data: how long it is to be stored, in what form, whether and how it is
    required to be released at some later point, etc.  This is especially true if the base contract/grant is a Federal one.

    So always ask where the money to support this project is coming from.  Also ask if there are grant/contract requirements
    for data.  The PI may not always understand these issues.

    There is also the issue of coding and who 'owns' the code.  If the task requires unique coding, who owns it?
    What does the contract say?

    Also, as you code and develop or implement your databases, it is imperative (I can't say this strongly enough) to document, document,
    document.  Clear and comprehensive documentation is a pain to do.  However, I can't tell you the number of times that
    I have been saved by clear documentation when a client returns a year or two after the initial project or publication and says, OK,
    now I want to reuse the data and reformat some of the analyses for a new purpose.  Without clearly documented code I would likely
    never remember exactly what was done.

    Get it in writing, make it part of your routine in accepting a task or job.

    Remember, get it in writing.  A verbal agreement is not worth the paper it's printed on.

    Even the 'smallest' job should at least have a memo of understanding.

    Good luck and welcome to the jungle :-)
    -------------------------------------------
    William Grant
    Professor, Emergency Medicine
    SUNY Upstate Medical University
    -------------------------------------------








  • 3.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-22-2013 00:52
    Dear William,

    Thank you so much for your response!  I am currently writing my first contract (for my previous work I have been given a contract by the client), so I will be sure to spell out the things you mentioned.  And keep good records and documentation!  :-)

    Best,
    Alicia

    -------------------------------------------
    Alicia Hansen
    Statistical Consultant
    -------------------------------------------








  • 4.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-22-2013 08:19
    I'm in complete agreement with William on this.  A contract / memo of agreement / non-disclosure agreement should spell out how the data and code are stored when not in use, the length of time that involved parties can have exclusive rights to the data, code and derivative products, and if/when the work can be used by the consultant beyond the project (perhaps for other clients).

    If the project or analysis leads to something truly groundbreaking, it is also worth thinking about issues such as patent and publication rights, ideally before the contract is signed.

    Depending on the complexity and cost of the project, having a contract lawyer review documents before you sign can be a good option to help protect your interests.

    -------------------------------------------
    Mark Lancaster
    George Mason University
    -------------------------------------------








  • 5.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-23-2013 01:13
    Hi Mark,

    Thanks so much for your feedback!  I hadn't thought about things like publication rights.  I will consider how to cover that.  Do you have any advice for finding a good contract lawyer?

    Thanks,
    Alicia

    -------------------------------------------
    Alicia Hansen
    Statistical Consultant
    -------------------------------------------








  • 6.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-21-2013 22:09
    Excellent question. In my personal opinion, many people would like to ask that question, but fear to seem clueless. Since you are somewhat new to things, you are in the excellent position of asking questions that many wish to ask, but fear to.

    I direct a small biostat and epidemiology group. We use SAS. We do the following:
    1) All projects are run using a specific project shell. The project shell contains subdirectories for data (excel, sas, raw), output (figures, stat results, tables) and programs (macros, formats, program). Thus, each project is structured with a parallel environment. For every project, excel data forms with raw data are always in the same relative location.
    2) Each project is begun by copying the project shell. Names of the program file, format file, and macro file are tailored to the project.
    3) The program file contains a macro shell to run programs. The macro shell contains macro file directory names. By changing 2 items in the macro shell, the entire project is set up to write output to the right place, read data from the right place, and define the SAS dataset subdirectory.
    4) All programs are written in the macro based environment. This facilitates easy use of macros to perform parallel analyses on multiple variables.
    5) The formats are all written to the format file. That way, they are always in one place.
    6) Macros which are built to be used along with the main analysis macro go into the macro file.
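
    For readers who script their setup, a project shell like the one in items 1) to 6) could be generated along these lines.  This is only a minimal Python sketch, not the group's actual SAS environment: the function name `create_project_shell`, the exact nesting of the subdirectories, and the `.sas` placeholder file names are assumptions made for illustration.

```python
from pathlib import Path

# Subdirectories named in the post; the nesting shown here is an assumption.
SHELL_LAYOUT = {
    "data": ["excel", "sas", "raw"],
    "output": ["figures", "stat_results", "tables"],
    "programs": ["macros", "formats", "program"],
}


def create_project_shell(root: Path, project: str) -> Path:
    """Create an empty project shell under root/current/<project>."""
    project_dir = root / "current" / project
    for parent, children in SHELL_LAYOUT.items():
        for child in children:
            (project_dir / parent / child).mkdir(parents=True, exist_ok=True)
    # Empty program, format, and macro files tailored to the project name
    # (hypothetical naming scheme).
    for folder, suffix in [("program", "program"), ("formats", "formats"), ("macros", "macros")]:
        (project_dir / "programs" / folder / f"{project}_{suffix}.sas").touch()
    return project_dir
```

    Because every project gets the same relative layout, a script or macro only needs the project root to find data and write output.  Copying a prepared template folder (for example with `shutil.copytree`) would be closer to the "copy the shell" step described in item 2).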

    At the end of each project, the entire subdirectory is simply moved from the "current" subdirectory to the "done" subdirectory. The SAS datasets, excel spreadsheets, output for memos, memos containing output and so forth are all saved in the subdirectory. Never throw away data.

    Never throw away data.
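
    The end-of-project step (moving the whole project from "current" to "done" without deleting anything) might be scripted as follows.  This is a hedged Python sketch; the function name `archive_project` and the assumption that "current" and "done" sit side by side under one root folder are illustrative, not part of the setup described above.

```python
import shutil
from pathlib import Path


def archive_project(root: Path, project: str) -> Path:
    """Move root/current/<project> to root/done/<project>; nothing is deleted."""
    source = root / "current" / project
    target = root / "done" / project
    if not source.is_dir():
        raise FileNotFoundError(f"No current project at {source}")
    if target.exists():
        # Refuse to overwrite an earlier archive of the same name.
        raise FileExistsError(f"Refusing to overwrite {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

    Refusing to overwrite an existing "done" folder keeps the move consistent with the rule of never throwing data away.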


    -------------------------------------------
    Paul Thompson
    Director, Methodology and Data Analysis Center
    Sanford Research/USD
    -------------------------------------------








  • 7.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-22-2013 00:58
    Dear Paul,

    Thank you very much for your response!  I really appreciate you sharing your protocol.  It sounds like you have a great system.  I hope to be as organized myself...  It can be intimidating to reveal one's lack of knowledge, but it is the only way to learn...  :-)

    Best,
    Alicia

    -------------------------------------------
    Alicia Hansen
    Statistical Consultant
    -------------------------------------------








  • 8.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-22-2013 06:22
    Great advice on naming for file storage.  Thanks, Paul.

    In a related fashion, I do something similar with file names.  All analyses for one data set are named based on the data file, with one command file to read the data, another to groom it (correct errors, add labels, do transformations, etc.), and one or more to run the various analyses.  So if the data file from the client is named client_data_file.xls, the other files might be:
    client_data_file_read.sps
    client_data_file_groom.sps
    client_data_file_analyze.sps

    Any data files I save are named similarly, such as:
    client_data_file.sav
    client_data_file_groomed.sav

    Output files are similarly named.

    That way, all the analyses for one data set are easily identifiable in a search of my hard drive.
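
    The naming scheme above can be expressed as a small helper.  This is a minimal Python sketch under stated assumptions: the function names `derived_names` and `find_related` are hypothetical, and the suffixes simply mirror the example file names listed above.

```python
from pathlib import Path


def derived_names(data_file: str) -> dict:
    """Derive command-file and saved-data names from the client's data file name."""
    stem = Path(data_file).stem  # "client_data_file.xls" -> "client_data_file"
    return {
        "read": f"{stem}_read.sps",
        "groom": f"{stem}_groom.sps",
        "analyze": f"{stem}_analyze.sps",
        "saved_data": f"{stem}.sav",
        "groomed_data": f"{stem}_groomed.sav",
    }


def find_related(folder: Path, data_file: str) -> list:
    """List every file under folder whose name starts with the data file's stem."""
    stem = Path(data_file).stem
    return sorted(p for p in folder.rglob(f"{stem}*") if p.is_file())
```

    For example, `derived_names("client_data_file.xls")["groom"]` gives `client_data_file_groom.sps`, and `find_related` does the kind of search by shared prefix that the naming scheme is meant to make easy.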

    As to saving data, that depends on the project, as others have said.

    The problem I wrestle with is dealing with computer backups in purging data.

    How do you do that?

    Joel

    -------------------------------------------
    Joel Wiesen
    Director
    Applied Personnel Research
    -------------------------------------------








  • 9.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-22-2013 21:29
    Hi Joel,

    Thanks for your response!  It sounds like you have a good system too.  As for the issue of purging data and computer back-ups...  perhaps maintain the data on a separate external drive that isn't backed up with the rest of your system?  Just an idea...

    Thanks,
    Alicia

    -------------------------------------------
    Alicia Hansen
    Statistical Consultant
    -------------------------------------------








  • 10.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-22-2013 21:41
    One more option to consider is to have your client provide internet access and a method to store scripts, data files, output, etc. on the client server/resource.


    -------------------------------------------
    Chris Barker, Ph.D.
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com

    ---
    "In composition you have all the time you want to decide what to say in 15 seconds, in improvisation you have 15 seconds."
    -Steve Lacy
    -------------------------------------------








  • 11.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-29-2013 20:38
    Agreed.  I find this approach works really well for one of my current projects.  It is especially great for cases where you are working with data containing confidential information.

    Thanks for your response,
    Alicia

    -------------------------------------------
    Alicia Hansen
    Statistical Consultant
    -------------------------------------------








  • 12.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-30-2013 09:02
    In a post last week, I described the method for structuring analyses and projects that is used in my group. If you are interested in a copy of a short .pdf, let me know.

    -------------------------------------------
    Paul Thompson
    Director, Methodology and Data Analysis Center
    Sanford Research/USD
    -------------------------------------------








  • 13.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-24-2013 08:52
    I work in an academic teaching hospital and have internal clients.   I give them everything at the end of the project, but I've started to make "back-up" files for them and hand these over throughout the project.  Should something happen to me, they can have the files.
    I also have a standard set of folders and labels.  We work from these files in meetings so that the clients begin to understand the system.  These are relatively small files and fit on CDs.  We still have computers that read CDs.
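
    One way to produce such periodic "back-up" copies is a dated archive of the project folder.  The Python sketch below is an assumption-laden illustration rather than the process described above: the function name `snapshot_project`, the zip format, and the date-stamped file name are all hypothetical choices.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def snapshot_project(project_dir: Path, dest_dir: Path, when: Optional[date] = None) -> Path:
    """Write <dest_dir>/<project>_<YYYY-MM-DD>.zip containing the whole project folder."""
    when = when or date.today()
    dest_dir.mkdir(parents=True, exist_ok=True)
    base = dest_dir / f"{project_dir.name}_{when.isoformat()}"
    # make_archive appends ".zip" to base and returns the archive's path as a string.
    archive = shutil.make_archive(
        str(base), "zip", root_dir=project_dir.parent, base_dir=project_dir.name
    )
    return Path(archive)
```

    The resulting file can then be copied to whatever medium the client can read, and the date in the name shows which snapshot is the most recent.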

    -------------------------------------------
    Dorothy Syblik
    -------------------------------------------








  • 14.  RE:Basic Question About Maintaining Data and Code for Consultants

    Posted 10-29-2013 20:42
    Hi Dorothy,

    Thanks for your response.  In addition to providing all the files to the client, do you archive a copy "long-term?"  The 'back-up' copies sound like a great idea.  I am resisting upgrading to a machine that doesn't have a CD drive....  :-)

    Thanks,
    Alicia

    -------------------------------------------
    Alicia Hansen
    Statistical Consultant
    -------------------------------------------