ASA Connect

  • 1.  CFA and small sample size

    Posted 06-17-2017 00:04
    Hi everyone, 

    The literature on Confirmatory Factor Analysis (CFA) suggests that large sample sizes are needed for this type of analysis; for example, some sources recommend an absolute sample size of 50 or more.

    I am working with a survey dataset where the sample size is much smaller: n = 28. The survey has a total of 50 items grouped in 9 dimensions (or belonging to 9 subscales). Each dimension is measured via k items, where k can be as small as 3 and as large as 11. (The last four dimensions are seemingly attempting to measure the same underlying construct.) All items are measured on a 5-point scale and the majority of the respondents chose the highest values on this scale (i.e., either 4 or 5).  

    One of the answers posted on CrossValidated states: "If you know that you have several subscales, then you should fit a CFA that has that many factors".  

    My question is:

    With such a small sample, is it acceptable to fit separate single-factor models to each of the 9 subscales rather than a 9-factor CFA (or perhaps 10-factor CFA)? If not, are there any other alternatives to be considered? (It seems to me that trying to fit a full-blown CFA model to such a small data set is just not a good idea.)  

    Thanks in advance for your thoughts,

    Isabella

    ------------------------------
    Isabella Ghement
    Ghement Statistical Consulting Company Ltd.
    ------------------------------


  • 2.  RE: CFA and small sample size

    Posted 06-19-2017 10:56
    Hi Isabella,

    I very rarely give this sort of answer to clients, but with that sample size I think you should forgo any CFAs or even EFAs. Compounding your small sample size is the fact that your items are very skewed (most respondents chose 4 or 5). The problem is that your items have extremely low variance, which results in weak covariances and thus weak correlations. Factor analysis is usually performed on the correlation matrix (it can also be done on a covariance matrix). Because of the skewness, the correlations are attenuated. Because of the small n, there is a great deal of sampling error in your estimates of each of the correlations. Then, based on those correlations, you'd be estimating mp + p parameters of a factor model (mp loadings plus p unique variances, for m factors and p items). You could not expect meaningful results from this.
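
    As a rough illustration of that count (a sketch only: the function name is made up for this example, and the values of m and p are taken from the survey description earlier in the thread, not from any fitted model):

```python
def efa_parameter_count(m: int, p: int) -> int:
    """Count mp loadings plus p unique variances for m factors and p items.

    Identification constraints (for example, fixing the rotation) would
    remove m*(m-1)/2 of these; this simple count ignores that.
    """
    return m * p + p


n_respondents = 28       # sample size reported in the thread
items, factors = 50, 9   # survey items and subscales reported in the thread

single_factor = efa_parameter_count(1, 11)        # largest subscale on its own
full_model = efa_parameter_count(factors, items)  # all items, nine factors

print(f"one factor, 11 items: {single_factor} parameters")
print(f"nine factors, 50 items: {full_model} parameters for n = {n_respondents}")
```

    Even the largest subscale taken alone asks for nearly as many parameters as there are respondents.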

    I have done some simulation research on the sample size necessary for performing EFA on dichotomous data. As you describe them, your data are virtually dichotomous. In a best-case scenario (single-factor models with many variables per factor, all variables having a 50/50 distribution, and high true loadings on the common factor), a sample size in the 20s was marginally sufficient to properly estimate the factor model parameters. Your data would be more similar to the items with an 80/20 distribution that I examined. In that case, with all the other best-case aspects mentioned above, an n in the 40s or 50s was minimally sufficient. Below is a link to the paper. If you choose to look at it you'll be in rare company :) .

    Pearson & Mundfrom (2010)
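
    The attenuation described above can be illustrated with a small simulation. This is only a sketch and not a reproduction of the paper's design: it assumes two items driven by bivariate normal latent scores with a correlation of 0.5 (an arbitrary choice), and the seed, sample size, and function names are made up for the example.

```python
import random
from statistics import NormalDist


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def dichotomized_correlation(pairs, share_high):
    """Cut both latent scores so that share_high of answers are 'high' (1)."""
    cut = NormalDist().inv_cdf(1 - share_high)
    xs = [1 if z1 > cut else 0 for z1, _ in pairs]
    ys = [1 if z2 > cut else 0 for _, z2 in pairs]
    return pearson(xs, ys)


rng = random.Random(2017)   # fixed seed so the run is repeatable
rho_latent = 0.5            # assumed latent correlation (illustrative)
n_draws = 50_000            # large n, to isolate attenuation from sampling error

# Correlated standard normal pairs: z2 = rho*z1 + sqrt(1 - rho^2)*e
pairs = []
for _ in range(n_draws):
    z1 = rng.gauss(0, 1)
    z2 = rho_latent * z1 + (1 - rho_latent ** 2) ** 0.5 * rng.gauss(0, 1)
    pairs.append((z1, z2))

phi_5050 = dichotomized_correlation(pairs, 0.5)  # balanced items
phi_8020 = dichotomized_correlation(pairs, 0.8)  # skewed items, 80% 'high'

print(f"latent: {rho_latent:.2f}  50/50: {phi_5050:.3f}  80/20: {phi_8020:.3f}")
```

    Under these assumptions the observed correlation drops below the latent one once the items are dichotomized, and drops further when the split is 80/20 rather than 50/50. With n = 28 the sampling error would come on top of that.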

    ------------------------------
    Robert Pearson
    Asst. Professor
    Grand Valley State University
    ------------------------------



  • 3.  RE: CFA and small sample size

    Posted 06-19-2017 21:45
    Dr. Pearson makes some excellent observations.  His approach might be the more conservative and appropriate one, especially if your client does not have a grasp of the risk associated with potentially problematic information.

    But, with that caveat, you can certainly estimate the single-factor models (either EFA or CFA) to get some idea of the degree to which the items "hang together" as scales.  You could also estimate the 9-factor CFA, although here the number of parameters estimated -- even if you assume that error variances are equal -- will swamp the number of cases that you have.  Thus your risk of misleading findings is considerable, but you will get some (inherently flawed) evidence as to the purity of the scales, i.e., the presence or absence of cross-loading across dimensions.  
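
    To put numbers on "swamp", here is a minimal count for the 9-factor CFA. It is a sketch under stated assumptions: each of the 50 items loads on exactly one factor, factor variances are fixed to 1, all factor correlations are free, and "equal error variances" is read as a single variance shared by every item. The function name is hypothetical.

```python
def cfa_free_parameters(n_items, n_factors, equal_error_variances=False):
    """Free parameters in a simple-structure CFA with factor variances fixed to 1."""
    loadings = n_items                                    # one loading per item
    error_variances = 1 if equal_error_variances else n_items
    factor_correlations = n_factors * (n_factors - 1) // 2
    return loadings + error_variances + factor_correlations


n_cases = 28  # sample size reported in the thread

free_errors = cfa_free_parameters(50, 9)
equal_errors = cfa_free_parameters(50, 9, equal_error_variances=True)

print(f"free error variances:  {free_errors} parameters, {n_cases} cases")
print(f"equal error variances: {equal_errors} parameters, {n_cases} cases")
```

    Because every item contributes exactly one loading under simple structure, the count does not depend on how the 50 items are split across the 9 subscales.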

    If I had to speculate -- given the data that you have provided -- I'd go so far as to say that your client has written an instrument that is plagued with social desirability and thus produces the "accepted" top box and top 2 box responses -- hence the highly skewed distributions.  My guess is that the set of single factor EFAs will look acceptable, and that a 9 factor CFA will look OK (due to the small N) but with evidence of lots of cross-loadings across the dimensions.

    Bottom line: I doubt that this is a problem of statistics per se but rather one that ties to skill in drafting an unbiased research instrument.


    ------------------------------
    David Mangen
    ------------------------------



  • 4.  RE: CFA and small sample size

    Posted 06-20-2017 09:38
    "To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of."
    • Presidential Address to the First Indian Statistical Congress, 1938. Sankhya 4, 14-17.
    I copied the above from https://en.wikiquote.org/wiki/Ronald_Fisher

    ------------------------------
    Emil M Friedman, PhD
    emilfriedman@gmail.com
    http://www.statisticalconsulting.org
    ------------------------------



  • 5.  RE: CFA and small sample size

    Posted 06-20-2017 11:00
    On further thought, if you do elect to move forward with any estimation, I'd suggest starting with some very conservative models such as parallel measures or tau-equivalent models.  This way you can see how much a conservative model that requires relatively few parameters improves the model fit in comparison to the baseline.  These can be easily specified in, e.g., LISREL and presumably other packages.
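
    For a sense of how few parameters these models need, here is a rough count for a single-factor model of k items. It is a sketch that assumes the usual definitions (parallel: equal loadings and equal error variances; tau-equivalent: equal loadings, free error variances; congeneric: everything free), a factor variance fixed to 1, and covariance structure only, with no means. The function name is made up for the example.

```python
def single_factor_models(k):
    """Return {model: (free parameters, degrees of freedom)} for k items."""
    moments = k * (k + 1) // 2   # distinct variances and covariances
    counts = {
        "parallel": 2,             # one shared loading, one shared error variance
        "tau-equivalent": 1 + k,   # one shared loading, k error variances
        "congeneric": 2 * k,       # k loadings, k error variances
    }
    return {name: (q, moments - q) for name, q in counts.items()}


# Smallest and largest subscale sizes mentioned in the thread
for k in (3, 11):
    for name, (params, df) in single_factor_models(k).items():
        print(f"k={k:2d}  {name:15s} parameters={params:2d}  df={df}")
```

    With k = 3 the congeneric model is just identified and cannot be tested, whereas the parallel and tau-equivalent versions leave degrees of freedom even for the shortest subscale.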

    ------------------------------
    David Mangen
    ------------------------------