ASA Connect

Should we impute missing data while presenting descriptive stat?

1. Should we impute missing data while presenting descriptive stat?

0 Recommend
Md Abdullah Mamun
Posted 07-26-2016 01:38
Most of the proposed methods for missing data imputation are geared toward regression analysis. Should we (or can we) impute missing data when the objective is merely to present some descriptive statistics (mean, SD, mode) in the preliminary tables of a manuscript? If so, which method is appropriate? I am familiar with mean imputation, stochastic imputation, and multiple imputation. Assume that the missing data meet the MCAR or MAR criteria.

Thank you in advance,

Mamun
------------------------------
Md Abdullah Mamun
PhD Student
UNTHSC
------------------------------
2. RE: Should we impute missing data while presenting descriptive stat?

1 Recommend
James Knaub
Posted 07-28-2016 09:11
Mamun -

Perhaps it's just semantics, but it seems to me that if you give descriptive statistics for a data set, and there are missing data, then this would actually require inference.

Regression is one important area.

Also, even if you are just looking at missing completely at random, it would seem that an example where you need to impute would be if you are showing totals, not just means, by group.
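A minimal Python sketch of the totals point. The group names and values are made up for illustration, and `None` marks a missing value. Under MCAR the observed mean is still a reasonable estimate of the group mean, but the sum of the observed values alone understates the group total, so some form of imputation (here simple mean imputation, one of many possible methods) is needed.

```python
from statistics import mean

# Hypothetical data: one measurement per unit in two groups; None = missing.
groups = {
    "A": [10.0, 12.0, None, 14.0],
    "B": [20.0, None, None, 22.0],
}

def observed_total(values):
    """Sum of the observed values only; missing units contribute nothing."""
    return sum(v for v in values if v is not None)

def mean_imputed_total(values):
    """Total after replacing each missing value with the observed mean.

    This equals (observed mean) x (group size) and is only sensible
    if the data are missing completely at random within the group.
    """
    observed = [v for v in values if v is not None]
    return mean(observed) * len(values)

summary = {
    name: (observed_total(vals), mean_imputed_total(vals))
    for name, vals in groups.items()
}

for name, (raw, imputed) in summary.items():
    print(f"group {name}: observed-only total = {raw:.1f}, "
          f"mean-imputed total = {imputed:.1f}")
```

With these made-up numbers the observed-only totals are 36.0 and 42.0, while the mean-imputed totals are 48.0 and 84.0. The group means are unchanged by this kind of imputation; only the totals move.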

There are many appropriate methods.

Perhaps I misunderstood your question?

Cheers - Jim
------------------------------
James Knaub
Lead Mathematical Statistician
Retired

3. RE: Should we impute missing data while presenting descriptive stat?

0 Recommend
Emil Friedman
Posted 07-29-2016 05:09
A lot depends on the context. When I worked for <proprietary> R&D I sometimes wrote formal reports saying things like, "This suggests.......However, <here I described the caveats>. To resolve these issues we could <describe future experiments>." Then they started worrying about paper trails and lawsuits. My present job is in a quality department of a regulated industry where I would be worried about putting something like that in an email.
------------------------------
Emil M Friedman, PhD
emilfriedman@gmail.com
http://www.statisticalconsulting.org

4. RE: Should we impute missing data while presenting descriptive stat?

1 Recommend
Rachael DiSantostefano
Posted 07-29-2016 08:55
I would prefer to see the 'actual' data in that demographics or descriptive table, including how many items are missing (say, n=500 overall, but for some variables it's n=475). That shows just what is missing, and the extent of it, before you apply any assumptions about the missing data and how to impute it.
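As a rough illustration of such a table, here is a small Python sketch that reports, for each variable, the observed n, the number missing, and the mean and SD of the observed values only. The variable names (`age`, `bmi`) and the records are hypothetical, and `None` marks a missing value.

```python
from statistics import mean, stdev

# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "bmi": 22.5},
    {"age": 51, "bmi": None},
    {"age": None, "bmi": 27.0},
    {"age": 45, "bmi": 31.5},
    {"age": 29, "bmi": None},
]

def describe(rows, variable):
    """Complete-case summary of one variable, with missing counts."""
    observed = [r[variable] for r in rows if r[variable] is not None]
    return {
        "n_total": len(rows),
        "n_observed": len(observed),
        "n_missing": len(rows) - len(observed),
        "mean": mean(observed),
        "sd": stdev(observed),  # sample SD of the observed values
    }

table = {var: describe(records, var) for var in ("age", "bmi")}

for var, row in table.items():
    print(f"{var}: n={row['n_observed']}/{row['n_total']} "
          f"(missing {row['n_missing']}), "
          f"mean={row['mean']:.2f}, SD={row['sd']:.2f}")
```

Nothing is imputed here. The reader sees the per-variable n alongside each statistic and can judge the extent of missingness directly.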

Context is important too (on appropriateness of imputation and methods) as others have written. If you later did some imputing so that you could retain things in the model (for example), then you can describe imputation and why you chose certain methods.

For example -- When I run recursive partitioning (in R), there are default assumptions for filling in missing data. I usually use those and try others (mode, etc) to see how sensitive results are to if and how I've imputed.

Good luck!

Rachael
------------------------------
Rachael DiSantostefano

5. RE: Should we impute missing data while presenting descriptive stat?

1 Recommend
Eugenie Coakley
Posted 08-01-2016 08:09
I agree with the comment about presenting at least one descriptive table that portrays the actual data and includes a count of the number of missing values for each variable. It helps the reader understand the extent and nature of the missingness and why imputation was applied.
------------------------------
Eugenie Coakley, MA, MPH, PStat
Senior Consultant/Statistician
John Snow, Inc.

6. RE: Should we impute missing data while presenting descriptive stat?

1 Recommend
Nora Galambos
Posted 08-01-2016 09:03
I agree with Rachael and Eugenie. A table that describes the data as is, prior to imputation, along with the missing-value frequencies, is important. Down the road you do not want anyone to wonder whether you were covering up a serious missingness problem. Depending upon the imputation method, demonstrating conditions like missing at random may be important.
------------------------------
Nora Galambos
Senior Data Scientist
Stony Brook University

7. RE: Should we impute missing data while presenting descriptive stat?

0 Recommend
Julie Tackett
Posted 08-02-2016 14:04
Yes, if it is feasible. As far as I can determine, in this situation only the X (independent) variables can be imputed easily. I tried imputing Y/X variables using the regression imputation equation, but it was time-consuming and inexact.

8. RE: Should we impute missing data while presenting descriptive stat?

1 Recommend
Joseph Nolan
Posted 08-03-2016 08:11
Imputation, and statistical results in general, by nature are never "exact". Imputation is really just another form of (or an addition to) modelling. It has advantages (increased sample size) and disadvantages (e.g. increased variability) when used in conjunction with a statistical model.

I see little purpose in imputing data for descriptive statistics. If one of my clients asked me to do this, I'd ask them why. Odds are they would be hoping to somehow make the descriptive statistic more useful (quite possibly wanting to draw inference from it), which would lead to a teachable moment on just how a descriptive statistic is (or more likely is not) useful.
------------------------------
Joseph Nolan
Associate Professor of Statistics
Director, Burkardt Consulting Center
Northern Kentucky University
Department of Mathematics & Statistics

9. RE: Should we impute missing data while presenting descriptive stat?

0 Recommend
Julie Tackett
Posted 08-03-2016 12:05
I do appreciate all the comments. I see the pros and cons of imputing data for presenting descriptive stats in a paper. Still, I believe that the more closely the data from the full sample represent the population, the more fully we can give the reader a picture of the data (is it skewed, what are the outliers, how much dispersion is there, and so forth), and that in turn better informs the inferential statistics.
------------------------------
Julie Tackett

10. RE: Should we impute missing data while presenting descriptive stat?

0 Recommend
Jonathan Siegel
Posted 08-04-2016 14:49
In general I would not. I think the most important reason is that the existence and extent of missing data represent important descriptive facts about the phenomena being observed and/or your observation methods, and hence are relevant to any effective description. If a large portion (or only a small portion) of the data are missing, this would be important to know and relevant to the data's interpretability.

I agree that, because imputation methods require assumptions, they go beyond simply describing the data observed. In addition, they may result in underestimating, perhaps substantially, variances and CIs, and hence in overestimating precision.
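The variance point can be seen in a small Python sketch. It uses made-up numbers and single mean imputation, chosen only because it is the simplest case; stochastic and multiple imputation are designed to reduce this problem. Filling each missing value with the observed mean leaves the mean unchanged but shrinks the SD, and the inflated n shrinks the standard error further.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample: 4 observed values and 4 missing ones.
observed = [2.0, 4.0, 6.0, 8.0]
n_missing = 4

# Single mean imputation: every missing value becomes the observed mean.
imputed = observed + [mean(observed)] * n_missing

sd_observed = stdev(observed)
sd_imputed = stdev(imputed)

# Naive standard errors of the mean, treating imputed values as real data.
se_observed = sd_observed / sqrt(len(observed))
se_imputed = sd_imputed / sqrt(len(imputed))

print(f"mean: observed = {mean(observed):.2f}, imputed = {mean(imputed):.2f}")
print(f"SD:   observed = {sd_observed:.3f}, imputed = {sd_imputed:.3f}")
print(f"SE:   observed = {se_observed:.3f}, imputed = {se_imputed:.3f}")
```

The imputed values add no squared deviation but do add to the denominator, so the SD and any confidence interval built from it come out narrower than the observed data justify.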

There are situations where it is especially important to avoid imputation in one's descriptions. In situations involving heavy tails, for example, outliers such as long-term survivors, uncommon toxicity events, very high-income individuals, stock-market crashes, etc., are often very important to an effective description of the phenomenon being evaluated. Imputation methods tend to result in under-reporting or underestimating the impact of such phenomena.
------------------------------
Jonathan Siegel
Associate Director Clinical Statistics
