
  • 1.  Imputation methods in Establishment and Enterprise Surveys

    Posted 10-06-2011 11:28
    This message has been cross-posted to the following eGroups: Statistical Consulting Section and Survey Research Methods Section.
    -------------------------------------------

    Dear all,

     

    Good day!

    I am going to do a comparative study on "Imputation methods in enterprise and establishment surveys".

    What I need to know is the sampling design used for such surveys, the sample size, small and large establishments, unit non-response, item non-response, and of course the imputation method used for such surveys.

    I would like to learn from other countries' experiences on this issue so that I can choose the best method for my work. What I am looking for is something specific and applied, at the province and country level. I have done some searching but was not satisfied with what I found. Your country's experience in this matter would be of great help to me.

    Any help in the form of papers, documents, web pages, or other sources would be greatly appreciated.

    Many thanks,

    Amir


    -------------------------------------------
    Amir Kasaeian
    PhD Student in Biostatistics
    Tehran University of Medical Sciences (TUMS)
    amir_kasaeian@yahoo.com
    akasaeian@razi.tums.ac.ir
    -------------------------------------------


  • 2.  RE:Imputation methods in Establishment and Enterprise Surveys

    Posted 10-06-2011 12:00
    Your dissatisfaction with what you have read so far may be due to the fact that there are no cut-and-dried answers.  Whether you are dealing with survey data or any other form of statistically valid data, the proper method depends on the reason the data is missing.  Some methods, such as last observation carried forward (LOCF) for longitudinal data, are only good under very restrictive conditions, and there are other methods that are always as good or better. The only justification for LOCF is its simplicity, so under the appropriate conditions it can be used for that reason.  Methods such as multiple imputation or mixed-effects linear models are appropriate under the missing at random (MAR) assumption.  When you have nonignorable missingness, which occurs a lot in practice, you must find a way to understand the mechanism behind the missingness.
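
    To make the LOCF point concrete, here is a minimal sketch in plain Python. The function name `locf` and the visit values are made up for illustration; they are not from any survey or trial. It shows how LOCF freezes a subject at the last observed value, which is only reasonable if nothing changes after dropout.

```python
from typing import List, Optional, Sequence


def locf(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Last observation carried forward for one subject's ordered visits.

    Missing visits are given as None. Values missing before the first
    observation stay None because there is nothing to carry forward.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v          # remember the most recent observed value
        filled.append(last)   # observed value, or the carried-forward one
    return filled


# Hypothetical subject whose score rises steadily, then who drops out.
visits = [10.0, 12.0, 14.0, None, None]
print(locf(visits))  # [10.0, 12.0, 14.0, 14.0, 14.0]
```

    In this made-up series the score was rising by 2 per visit, yet LOCF imputes a flat 14.0 for both missing visits. That implicit "no change after dropout" assumption is the restrictive condition referred to above.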

    If this terminology looks foreign to you, I highly recommend that you read "Statistical Analysis with Missing Data 2nd Edition" by Rod Little and Don Rubin.  The second edition was published by Wiley in 2002.

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------








  • 3.  RE:Imputation methods in Establishment and Enterprise Surveys

    Posted 10-06-2011 14:26

    I would only add to Michael's comments an emphasis on the VERY: as he notes, LOCF (and BOCF, baseline observation carried forward, and most single imputation methods) operate under VERY restrictive conditions. A recent panel of the National Academies on The Prevention and Treatment of Missing Data in Clinical Trials (chaired by Rod Little) concluded: 

    "in nearly all cases, there are better alternatives to LOCF (last observation carried forward) and BOCF imputation"

    The full report is available for PDF download at:
    http://www7.nationalacademies.org/cnstat/News%20from%20CNSTAT%20July%2029%202010.pdf

    I would add that for the practicing consultant, the answer is once again often "well, it depends": on how much missing data there is, how severe the risk of informative missingness is, how many other predictors you have to support the MAR assumption, and so on. I completely concur with the report that there are nearly always better methods than LOCF/BOCF, but I take these other issues into account in the consultant's balance of getting deliverables to a collaborator that are statistically sound but also logistically feasible. Also, I usually run more than one model to check assumptions. It is usually easy to do things like best-case/worst-case informative missingness examinations which, though draconian, can offer insights into how sensitive your conclusions are to the missingness structures and assumptions. A short table summarizing results across different missingness mechanism assumptions is then transparent and clarifying.
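
    As a rough illustration of the best-case/worst-case idea, the sketch below assumes a binary (0/1) outcome and computes a proportion three ways: on complete cases only, with every missing value set to 0, and with every missing value set to 1. The function name `missingness_bounds` and the data are hypothetical.

```python
from typing import Dict, Optional, Sequence


def missingness_bounds(outcomes: Sequence[Optional[int]]) -> Dict[str, float]:
    """Proportion of successes under three treatments of missing 0/1 outcomes."""
    n = len(outcomes)
    observed = [y for y in outcomes if y is not None]
    if not observed:
        raise ValueError("need at least one observed outcome")
    n_missing = n - len(observed)
    successes = sum(observed)
    return {
        "complete_case": successes / len(observed),   # ignore missing units
        "worst_case": successes / n,                  # all missing are failures
        "best_case": (successes + n_missing) / n,     # all missing are successes
    }


# Hypothetical responses from 10 units; None marks a missing outcome.
outcomes = [1, 1, 0, 1, None, None, 0, 1, None, 1]
bounds = missingness_bounds(outcomes)
for label, value in bounds.items():
    print(f"{label:<14}{value:.3f}")
```

    With these made-up data the estimate ranges from 0.500 to 0.800 around a complete-case value of 0.714. A spread that wide would signal that the conclusion depends heavily on what is assumed about the missing units.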

    -------------------------------------------
    Michael Griswold
    Executive Director
    Univ MS Medical Center Biostatistics
    -------------------------------------------








  • 4.  RE:Imputation methods in Establishment and Enterprise Surveys

    Posted 10-06-2011 14:39
    I agree wholeheartedly with all of Michael Griswold's comments from both of his messages. I certainly meant VERY when I said very restrictive assumptions, and it applies to BOCF and other simplistic approaches as well as LOCF.  But for years, and possibly even to this day, the FDA has accepted LOCF in regulatory submissions.  An Amgen study comparing LOCF to multiple imputation has, I think, led to a change in attitude.  Don Rubin is even on their advisory committees.  So multiple imputation is getting much more use. In agreement with Michael Griswold's comments, the FDA these days likes to see more than one imputation method used as a sensitivity analysis for the method in use.
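
    For readers new to multiple imputation, the sketch below shows only the final pooling step, using the standard combining rules (Rubin's rules): the pooled estimate is the mean of the per-dataset estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name `pool_rubin` and the numbers are hypothetical, and the imputation step itself is assumed to have been done elsewhere.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Combine m completed-data estimates and their variances.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # sample variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical results from m = 3 imputed datasets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, round(total_var, 4))
```

    The between-imputation term is what single imputation methods such as LOCF leave out: they treat the filled-in values as if they had been observed, so the reported uncertainty is too small.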

    I think we have covered the theory well enough for our colleague from Iran.  I think he is more interested in actual applications.  So hopefully some of you can provide a few good stories.

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------








  • 5.  RE:Imputation methods in Establishment and Enterprise Surveys

    Posted 10-07-2011 08:33

    Dear all,

    Thank you so much for your kind attention and your informative comments. That's good.

    But what I need to know is the experience of other countries in real situations when they prepare a design for establishment surveys. How do they do it, and with what sample sizes? When the survey is finished, which imputation methods do they use for the missing data? What do they do when they encounter, e.g., something more than partial missingness? I mean, what do they do about item non-response, and what do they do about unit non-response? How do they estimate response and non-response rates, and so on?

    I think that is something far from the theoretical context. What I want is to get a hands-on feel for such situations.  Maybe reports from international statistical agencies on these issues could help me more. They could then be very important for suggesting the best approach to survey problems here in Iran.

    Introducing

    I am eager to receive your guidance again.

    Cheers,

    Amir


    -------------------------------------------
    Amir Kasaeian
    PhD Student in Biostatistics
    Tehran University of Medical Sciences (TUMS)
    amir_kasaeian@yahoo.com
    akasaeian@razi.tums.ac.ir
    -------------------------------------------








  • 6.  RE:Imputation methods in Establishment and Enterprise Surveys

    Posted 10-07-2011 08:55
    I hope that someone with more experience on this than I have can help Amir.  What is important is not just what they do but also why they do it a certain way.  The theory is very important to guide practice.

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------








  • 7.  RE:Imputation methods in Establishment and Enterprise Surveys

    Posted 10-10-2011 04:17

    Hi all,

    What Dr. Chernick said is actually what I need (i.e., why as well as what). I think someone with extensive experience in the field of official statistics could help me more with this issue.

    I'm looking forward to receiving your comments and guidance.

    Rgds,
    Amir
    -------------------------------------------
    Amir Kasaeian
    PhD Student in Biostatistics
    Tehran University of Medical Sciences (TUMS)
    amir_kasaeian@yahoo.com
    akasaeian@razi.tums.ac.ir
    -------------------------------------------








  • 8.  RE:Imputation methods in Establishment and Enterprise Surveys

    Posted 10-06-2011 12:15
    To add to my earlier response, I do not think there is anything specific in the survey design that can help with the missing data problem.  What a good design can do is make missing responses less likely, because the questions are easy to answer and unambiguous.  Also, the length of a survey is a factor in whether or not a respondent will answer all the questions.  I think these are more common-sense issues than statistical ones.  Although I have no specific citations, I know there are a number of references on good survey design.  The only comment I have on sample size is that the larger the sample size is, the more likely it is that you will have missing data.  But there you are trading off the dangers of missing data against the accuracy of estimates.  Also, most programs that calculate required sample size assume complete data.  A simple way to adjust would be to increase the sample size to account for missing data by assuming that a certain percentage of missing responses will occur.
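
    The simple adjustment described above can be sketched as follows, assuming the required complete-data sample size is already known and that missing responses occur at a single anticipated rate. The function name `inflate_sample_size` and the example figures are hypothetical.

```python
import math


def inflate_sample_size(n_complete: int, expected_missing_rate: float) -> int:
    """Sample size to field so that about n_complete usable responses remain.

    expected_missing_rate is the anticipated fraction of missing responses,
    e.g. 0.15 for 15%. The result is rounded up to a whole unit.
    """
    if n_complete < 0:
        raise ValueError("n_complete must be non-negative")
    if not 0 <= expected_missing_rate < 1:
        raise ValueError("expected_missing_rate must be in [0, 1)")
    return math.ceil(n_complete / (1 - expected_missing_rate))


# Hypothetical: 385 complete responses needed, 15% expected to be missing.
print(inflate_sample_size(385, 0.15))  # 453
```

    Dividing by the expected response rate, rather than adding 15% on top of 385 (which would give 443), is what leaves roughly 385 usable responses after the anticipated loss. The adjustment protects precision only; it does nothing about bias if the missingness is informative.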

    I know you are interested in seeing what people have done in real applications and hopefully others can provide that.  I am only trying to address the methodological issues in general.

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------