PCA of transformed Likert-Scale data

  • 1.  PCA of transformed Likert-Scale data

    Posted 03-09-2013 09:01
    I teach a graduate statistics class, and students are required to do data analysis projects. I have one student who has 7-point Likert-scale questionnaires (Disagree Strongly 1 ... Agree Strongly 7) about chemistry students' attitudes toward different aspects of chemical safety and their personal assessment of the risk of different chemicals. Her group has given the questionnaire to many classes, some trained in principles of green chemistry and others with the standard curriculum. She wants to test for differences between students who have been trained in green chemistry techniques and others who have gone through the standard chemistry curriculum. She also wants to analyze the questionnaires to summarize the responses, and to note how student perception of chemicals and chemical risk factors is affected by training in green chemistry.
      I'm an ecologist who uses PCA a lot. I can suggest that she do a PCA on the data, examining the correlation biplot to see which questions have strongly correlated responses (positive and negative). She can examine the Euclidean distance biplot to see how the classes differ and whether there is a distinct difference between green and non-green classes. An exploratory factor analysis could indicate whether there are discrete factors underlying the student responses on the many questions. Differences between the two groups of students might be assessed after the factor analysis by testing group differences on the factor scores. Alternatively, the group differences in Likert-scale responses can be modeled explicitly using redundancy analysis, in which the questions that differ between groups are clearly identified.
      In ecology, we'd never analyze the untransformed data. My question is, "Are there standard transformations of Likert-scale variables that are carried out prior to performing multivariate analysis?" My student can do the analysis untransformed, but perhaps the student x question data should be row-normalized (sum of squared scores for each student equal to 1) prior to analysis, since students may differ in how much of the response scale they use (one student's 1-to-7 may be another's 3-to-5). Similarly, prior to PCA, the student x question data should be centered by subtracting each question's mean, or the first axis will simply reflect the mean scores and not the student-to-student variance (a sketch of both steps follows below). A simultaneous row and column standardization would lead to a correspondence analysis of the Likert-scale data.
      There might be some definitive reference or set of references on the transformation of Likert-scale data for multivariate analysis. If so, could you post it for me and my student? A Google Scholar search didn't come up with much that was useful.
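
    A minimal R sketch of the two transformations described above, assuming the responses sit in a hypothetical students x questions matrix X:

        # Hypothetical data: 50 students answering 10 seven-point items.
        set.seed(1)
        X <- matrix(sample(1:7, 50 * 10, replace = TRUE), nrow = 50)

        # Row normalization: rescale each student's responses so the sum of
        # squared scores equals 1, removing differences in how much of the
        # scale each student uses.
        X_row <- X / sqrt(rowSums(X^2))

        # Column centering: subtract each question's mean so the first PCA
        # axis reflects student-to-student variance, not the mean scores.
        X_cent <- scale(X_row, center = TRUE, scale = FALSE)

        # PCA on the transformed matrix (centering already done above).
        pca <- prcomp(X_cent, center = FALSE, scale. = FALSE)
        biplot(pca)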

    -------------------------------------------
    Eugene Gallagher
    Associate Professor
    Univ of Massachusetts
    -------------------------------------------


  • 2.  RE:PCA of transformed Likert-Scale data

    Posted 03-09-2013 14:58
    Dear Eugene,

    Here are a couple of comments.

    1. You may want to look closely at what the experimental units should be for this study. The treatments are the teaching of "green" or "no green". To what do they get applied? I see the experimental units as the classes, whereas the students are measurement units. A useful trick for seeing this is to note that an analysis by students would want to use classes as blocks, but that doesn't work because all students in the same class have the same treatment. Once you agree that classes are the experimental units, the scale issue goes away. You take the measurement on a class to be the average score of its students (for one question or a set of questions) and then analyze these class measures by standard methods for normal distributions. I suspect you may not have too many classes, so this might be a little frustrating, but if your treatment effect is large compared with the class standard error, you might get something. If I'm wrong about classes all having the same treatment, and they are instead more of a sampling unit, or a block in which some students have the green training and some do not, then students would be your experimental units.

    2. If you still want to analyze with students as the experimental units, I have found that such scales don't carry much more information than a simple binary response. I would suggest reducing the Likert scale to a binary response: agree vs. disagree, or greater than vs. less than the mean response (this is a transformation). You can then analyze by the many methods for binary responses; I have always liked logistic regression for comparing treatments, especially if you have some covariates measured on the students (see the sketch after this post).

    Good luck!
    -------------------------------------------
    Bill Stewart
    Distinguished Biostatistician
    EMB Statistical Solutions, LLC.
    -------------------------------------------
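
    A hedged R sketch of the dichotomize-and-model idea in comment 2 above; the data frame dat and the variables score, green, and years_chem are all hypothetical names:

        # Hypothetical student-level data.
        set.seed(2)
        dat <- data.frame(
          score      = sample(1:7, 200, replace = TRUE),
          green      = rep(c(0, 1), each = 100),
          years_chem = rpois(200, 2)
        )

        # Transformation: collapse the 7-point scale to agree (5-7) vs.
        # disagree (1-4); where to put the neutral category is a judgment call.
        dat$agree <- as.integer(dat$score >= 5)

        # Logistic regression comparing treatments, adjusting for a covariate.
        fit <- glm(agree ~ green + years_chem, family = binomial, data = dat)
        summary(fit)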


  • 3.  RE:PCA of transformed Likert-Scale data

    Posted 03-09-2013 16:10

    Following option #2 of the previous post (Stewart), one could treat the data as multinomial with a cumulative-logit link and incorporate within-class correlations. A hedged sketch follows.
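
    One way to fit such a model in R is the ordinal package's clmm(); the data and variable names below are hypothetical:

        library(ordinal)

        # Hypothetical data: 5 classes of 40 students; whole classes are
        # either green-trained (1) or standard (0).
        set.seed(3)
        dat <- data.frame(
          class = factor(rep(1:5, each = 40)),
          green = rep(c(0, 1, 0, 1, 0), each = 40)
        )
        dat$score <- factor(pmin(7, pmax(1, round(rnorm(200, 4 + dat$green, 1.5)))),
                            levels = 1:7, ordered = TRUE)

        # Cumulative-logit (proportional odds) mixed model: fixed treatment
        # effect, random class intercept for the within-class correlation.
        fit <- clmm(score ~ green + (1 | class), data = dat)
        summary(fit)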

    Nagaraj

    -------------------------------------------
    Nagaraj Neerchal
    Professor and Chair
    UMBC
    -------------------------------------------








  • 4.  RE:PCA of transformed Likert-Scale data

    Posted 03-09-2013 17:03
    While I missed the earlier part of this discussion and am not quite clear about the study design, I assume this is a "cluster randomization" issue, in which the unit of treatment assignment is the class, not the student. In this type of design, the unit of analysis can be the student rather than the class, by applying a method of analysis that adjusts for the "intra-cluster correlation", or "cluster effect". This will, of course, increase statistical power relative to an analysis at the class level. One method of analysis that adjusts for the cluster effect is a mixed-model ANOVA (SAS proc mixed), in which treatment is a fixed effect, class is a random effect, and students contribute the residual variation (a hedged sketch follows).
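
    An R analogue of the proc mixed idea, sketched with the lme4 package; all data and names are hypothetical:

        library(lme4)

        # Hypothetical data: 6 classes of 30 students, whole classes
        # assigned to green (1) or standard (0) instruction.
        set.seed(4)
        dat <- data.frame(
          class = factor(rep(1:6, each = 30)),
          green = rep(c(0, 1), each = 90)
        )
        class_eff <- rnorm(6, 0, 0.4)   # cluster effects
        dat$score <- 4 + 0.5 * dat$green +
                     class_eff[as.integer(dat$class)] + rnorm(180)

        # Treatment as a fixed effect, class as a random intercept; the
        # class variance component carries the intra-cluster correlation.
        fit <- lmer(score ~ green + (1 | class), data = dat)
        summary(fit)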
    -------------------------------------------
    Edith Zang
    Independent Consultant
    NYCASA
    -------------------------------------------








  • 5.  RE:PCA of transformed Likert-Scale data

    Posted 03-11-2013 12:06
    Classical psychometric methods would simply perform a PCA on the untransformed Likert items to identify the underlying latent structure of the questionnaire. Normality is unimportant, since we are not conducting a formal statistical test of the fit of the latent model to the data. Factor scores could be generated from the final rotated component solution, which she could use to compare the two groups of students (conventional vs. green chemistry instruction), just as you suggest.

    The optimal approach to her problem, however, is to apply a Rasch item-response model for ordered categorical data to evaluate the items in terms of fit, unidimensionality, and their ability to discriminate students at different levels of the underlying trait she is attempting to measure. Principal components on the person-item residuals will provide information about residual structure. You can use these findings to identify potential latent traits, reorganize the Rasch analyses around the subsets of items associated with each potential latent trait, and re-evaluate the item fit and the mapping of the item responses to levels of the latent trait.

    The beauty of the Rasch approach is that it employs a mixed-effects generalized linear model, which allows incorporation of additional variables such as green vs. non-green chemistry instruction. You not only get a test of differences at the level of the overall score, but you also get information on differential item functioning (group differences at the level of individual items), which may be more interesting and informative in this case.

    Admittedly, this approach may be beyond the content covered in your course. There are a number of software packages specifically designed to handle these types of analyses, but any package that includes mixed-effects generalized linear models can be programmed to fit a Rasch model. I've used the eRm package in R to fit Rasch models to Likert scales (a sketch follows), but I know that folks have written SAS and SPSS macros to do these analyses as well.
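
    A minimal eRm sketch for polytomous Likert items; the matrix X is hypothetical, and random data are used only to show the call pattern (real item responses would be needed for a meaningful fit):

        library(eRm)

        # Hypothetical persons-x-items matrix; eRm expects polytomous
        # categories coded starting at 0, so a 7-point item becomes 0-6.
        set.seed(5)
        X <- matrix(sample(0:6, 100 * 8, replace = TRUE), nrow = 100)

        # Rating scale model, a Rasch-family model for ordered categories.
        fit <- RSM(X)

        # Person parameters, then item-fit statistics for assessing fit
        # and unidimensionality.
        pp <- person.parameter(fit)
        itemfit(pp)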

    Your proposed approach is the simplest and most efficient for her immediate purposes. However, there is no need to perform an exploratory factor analysis on top of the PCA. The scores generated from the PCA are determinate, while factor scores generated from an exploratory factor analysis are indeterminate (subject to greater sampling variability).


    -------------------------------------------
    John Cornell
    Professor
    University of Texas Health Science Center
    -------------------------------------------








  • 6.  RE:PCA of transformed Likert-Scale data

    Posted 03-11-2013 14:01
    No transformations are done on Likert items a priori, but the correlations implicitly standardize the items across cases. Ipsatizing the cases seems like overkill for students; it might be of interest in very advanced psychometric studies.

    Conventionally, Likert items are treated as not severely discrepant from interval level, and their sum is likewise treated as not severely discrepant from interval level.


    There are two major approaches to factor analysis. Both use the inter-item Pearson correlation matrix; the difference is largely in what is placed on the principal diagonal. The kind used in attitude-scale construction, principal axis factoring (PAF), is interested in the common variance among the items and treats the variance unique to each item as noise, so it places estimates of reliability (usually the squared multiple correlation of each item with the other items) on the diagonal. The other approach, principal components (PCA), assumes that all of the item variance is of interest. PCA is more frequently used in contexts where the variables are measured at the ratio level, e.g., with physical constructs.

    Each item is considered a rough measurement of a construct.  The sum of the items is considered a more valid and reliable measure of the construct.  There is no substantive meaning to zero in attitude measures. 


    There are refinements and variations in approach that differ according to the specifics of the situation.

    If the attitude items are parts of pre-existing validated Likert scales, designed to measure specific constructs, a different approach would be used. 

    Assuming that there are two sets of items, that the aim is to explore what structure there might be within each set, and further to see whether the two groups differ in where they are located on the derived dimensions, I would stick with a very conventional attitude-scale analysis.

    Use principal axis factor analysis with varimax rotation to maximize differential validity.

    Determine the number of factors to retain in each set; this is an area with an art aspect. For advanced students, parallel analysis would be part of the decision making. In the retained solution, items that do not load cleanly, or do not load at all, are not used. Typically, a scale to represent a factor is created only when at least 3, and preferably 4 or 5, items go together and make sense as measures of some construct. (A sketch of this step follows.)
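
    A hedged sketch of this conventional sequence using the psych package; the data frame X and the choice of 2 factors are hypothetical:

        library(psych)

        # Hypothetical respondents-x-items data frame of 7-point responses.
        set.seed(6)
        X <- as.data.frame(matrix(sample(1:7, 150 * 12, replace = TRUE), nrow = 150))

        # Parallel analysis: compares observed eigenvalues against those
        # from random data as one input to the number-of-factors decision.
        fa.parallel(X, fm = "pa", fa = "fa")

        # Principal axis factoring with varimax rotation; inspect loadings
        # for items that do not load cleanly.
        fit <- fa(X, nfactors = 2, fm = "pa", rotate = "varimax")
        print(fit$loadings, cutoff = 0.3)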

    If the set of items is well designed, some items will need to be "reflected" (reverse-coded).

    Scale scores are created as the means of sets of items that go together. An item is used in only one scale, and each item gets the same weight: one. Using weighted sums of items has frequently failed when scales were used across subpopulations and studies. (Reflection and unit-weight scoring are sketched below.)
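
    A small sketch of reflection and unit-weight scale scores, continuing the hypothetical data frame X from the previous sketch:

        # "Reflect" (reverse-code) a negatively worded item on a 1-7 scale.
        reflect <- function(x) 8 - x
        X$V3 <- reflect(X$V3)   # suppose item V3 loads negatively

        # Scale score: the unweighted mean of the items that go together
        # on one factor; each item appears in exactly one scale.
        scale_A <- rowMeans(X[, c("V1", "V2", "V3")])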


    If the instrument administration was not done carefully, there may be missing data. If a factor analysis with listwise deletion of missing cases shows items that did not load, or did not load cleanly, fewer cases may be lost when the factor analysis is re-run without those items in the variable list.

    Likert items are used in creating Likert scales because people are fairly consistent in using the response scale.

    Some form of GLM with the scale scores as DVs can be used to look at the difference between the groups; the simplest would be t-tests (one-line sketch below).
    More complex models of the error should also be tried, and the write-up should look at whether the more complex models make a substantive difference in the conclusions.
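
    The simplest comparison on a scale score, assuming a hypothetical grouping vector for green vs. standard instruction:

        # Hypothetical group labels aligned with the rows of X.
        group <- rep(c("green", "standard"), length.out = length(scale_A))
        t.test(scale_A ~ group)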

    HTH


    -------------------------------------------
    Arthur Kendall
    Social Research Consultants
    -------------------------------------------








  • 7.  RE:PCA of transformed Likert-Scale data

    Posted 03-12-2013 08:23
    Thanks so much to the group. I had a long discussion with my student yesterday and showed her the various options available to her. I went through my collection of books and articles on multivariate analyses and did find several articles on correspondence analysis and multiple correspondence analysis of Likert scale questionnaire data. The following book was especially interesting:
    Greenacre, M. and J. Blasius, eds., 2006. Multiple correspondence analysis and related methods. Chapman & Hall/CRC. 581 pp.
      Articles in that book by Blasius and Greenacre (Chapter 1), Greenacre (Ch 2), Gower (Ch 3), and Nishisato (Ch 6) all deal with CA and CA-related methods for analyzing Likert-scale items. An appendix by Nenadic and Greenacre provides R code for CA and multiple correspondence analysis with Likert-scale data. I haven't tried their examples but intend to shortly. I have my own Matlab code for CA, and I have used the ecological CA programs in the vegan package.
      Most of the papers in the Blasius & Greenacre symposium volume are exploratory in nature, examining how the Likert-scale questions relate to each other and to variables such as the nationality of the respondent. A few papers, like Nishisato's, compare the CA approach with PCA. All of the papers involving MCA recode the data from the standard subject x question form to an indicator-matrix form (five 7-point Likert items would be coded for each subject as a 0/1 row vector with 35 columns and a row sum of 5) or the Burt-matrix form (all questions cross-classified by response, with five 7-point Likert items producing a 35 x 35 Burt matrix). The goal of these papers is not so much hypothesis testing as the exploration of the structure of a complex questionnaire and set of responses. A PCA should exhibit much of the same pattern among questions and respondents, but MCA might reveal nonlinear structures more readily (according to Nishisato's paper, his dual scaling reveals that migraines are associated with both high and low blood pressure). A brief MCA sketch follows.
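
      A hedged sketch of the MCA recoding with the ca package (by Nenadic and Greenacre, matching the book appendix mentioned above); the data frame X is hypothetical:

        library(ca)

        # Hypothetical subjects-x-questions data: 120 subjects, five
        # 7-point items, each treated as a categorical variable.
        set.seed(7)
        X <- as.data.frame(matrix(sample(1:7, 120 * 5, replace = TRUE), nrow = 120))
        X[] <- lapply(X, factor)

        # mjca() builds the indicator (or Burt) coding internally: five
        # 7-point items expand to 35 indicator columns with row sums of 5.
        fit <- mjca(X, lambda = "indicator")
        summary(fit)
        plot(fit)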
       There were just too few classes surveyed with my student's questionnaires to do much with the data at the class level, even though the class is the experimental unit and the students are the measurement units. She'll take a look at what is there. There were other variables in the questionnaire that might be amenable to analysis, such as examining years of experience in chemistry against Likert-scale responses to questions assessing perception of chemical risk.
      I'm not familiar with the literature on how to deal with the cluster effect of classes. I'll read up on that. I suspect with so few classes (5), there is not much to be done in fitting a mixed model with SAS Proc mixed or similar methods in R to assess the class effect.
      I finally got a look at the questionnaire data and saw that a big part of my student's project is dealing with the coding of the responses and getting them into a form that can be analyzed with any method. There are many non-responses that will have to be assessed prior to doing any sort of analysis. Ecological data, which come in nice sample-by-species matrices, are much easier, I think.
      Generating PCA or Factor Scores, CA scores or MCA scores and analyzing the association with variables such as years of experience in chemistry might be very informative.  That will depend on there being a simple structure that is interpretable once the responses are graphically displayed.
      Thanks much for everyone's helpful comments.
    -------------------------------------------
    Eugene Gallagher
    Associate Professor
    Dept. of Environmental, Earth & Ocean Sciences
    Univ of Massachusetts Boston
    -------------------------------------------





  • 8.  RE:PCA of transformed Likert-Scale data

    Posted 03-12-2013 14:07
    I would endorse Art's suggestions with a few notes. The psychometric literature would suggest performing either an exploratory or confirmatory factor analysis to explore the dimensionality of your responses. Since you don't have an a priori notion of the structure, EFA would be appropriate. Methodologists involved with measuring human variables (attitudes, aptitude, personality traits) generally recommend most of the following choices for the "big decisions" in EFA.

    Big decision #1: what method of estimation to use, for which either Principal Axis Factoring (PAF) or Maximum Likelihood (MLFA) should be preferred over PCA, for exactly the reasons Art described. In addition, either of the first two is more likely to give you reproducible factor-loading estimates, because they exclude random measurement error before estimation.

    Big decision #2: How to determine the number of factors. This is a research interest of mine. My simulation studies have suggested that the best initial guess comes from parallel analysis if possible (again spot on by Art). If you have access to SAS and are interested, I can send you a macro to do this. I also think a scree plot can be very effective. Note that the traditional eigenvalues-over-1 rule has been repeatedly shown to not work well. If you or anyone else is interested, I can send a copy of an in-press manuscript.

    Big decision #3: factor rotation. You want to do this to obtain a more interpretable solution. My one deviation from Art's recommendations is to start with an oblique rotation, which allows the factors to be correlated. After investigating the estimated factor correlations, you can then decide whether to move to the simpler orthogonal factor model given by a varimax rotation; that way the data are helping you decide (see the sketch below). Furthermore, human traits such as attitudes are difficult to conceive of as being completely independent of one another.
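
    A hedged sketch of the oblique-first strategy with psych::fa, reusing a hypothetical Likert data frame X as in the earlier sketches:

        library(psych)
        library(GPArotation)   # supplies the oblimin rotation used by fa()

        fit_obl <- fa(X, nfactors = 2, fm = "pa", rotate = "oblimin")

        # Phi holds the estimated factor correlations; if they are near
        # zero, the simpler orthogonal (varimax) solution is defensible.
        round(fit_obl$Phi, 2)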

    Here are a couple of good references on using EFA for item analysis.

    Gorsuch, R. L. (1997). Exploratory factor analysis: its role in item analysis. Journal of Personality Assessment, 68(3), 532-560.

    Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272-299.

    Conway, J. M., & Huffcutt, A. I. (2003). A review and evaluation of exploratory factor analysis practices in organizational research. Organizational Research Methods, 6(2), 147-168.


    Some final comments. It is not usual to transform the data first, since EFA doesn't really involve probabilistic inferences like hypothesis tests but is rather used descriptively. Analysis is usually performed on the correlation matrix, so you could say you're analyzing standardized data.

    The correspondence analysis approach you refer to has not seen much attention in the psychometric literature, probably because there is so much art/finesse/subjectivity in its use; it is certainly not for lack of effort, as that literature is very extensive. I think the approach would most closely be associated with Q factor analysis (Q-mode FA), in which the factoring is applied to the observations (persons) rather than the variables, functioning something like a cluster analysis of respondents. Q factor analysis has not seen much attention in the last 20-30 years or so, a period in which probably many hundreds, if not thousands, of articles have been published on methods in this area.



    -------------------------------------------
    Robert Pearson
    Assistant Professor
    University of Northern Colorado
    -------------------------------------------








  • 9.  RE:PCA of transformed Likert-Scale data

    Posted 03-12-2013 15:08
    When parallel analysis became practical, I ran it on a few sets of data from the 70s. In those days we had decided on the number of factors to retain by the older methods, i.e., the scree test, Montanelli & Humphreys's equations, and of course divergent validity and interpretability. I plotted both sets of eigenvalues on a single scree plot. Without a formal study, it appeared that the number retained was where the obtained eigenvalue was about (1 + the eigenvalue from parallel analysis).

    I would be interested to know whether you are finding a similar "rule of thumb" in your simulations.

    "Furthermore, human traits such as attitudes are difficult to conceive of as being completely independent of one another."  True. But if one is going to make a distinction among constructs, when using unit weights one would not want to include an item in more than one scale.

    I do not recall a citation, but even in 1971 the way I was taught was that the Kaiser criterion (1.00) was used for computational convenience. Theoretically there could be as many derived dimensions as original dimensions, but why would one be interested in a dimension that did not account for as much variance as an average item?

    Q factor analysis was often used on ipsatized data as a kind of cluster analysis where assignments to clusters were "fuzzy". I saw it often in the 70s, but IIRC I have not seen it since about 1980.

    Macros for parallel analysis are available for SPSS, SAS, and MATLAB at
    https://people.ok.ubc.ca/brioconn/nfactors/nfactors.html

    One method of parallel analysis uses random permutations of the original data; the other simulates data with the same number of variables, cases, and item-response scale (both are sketched below). I have not noticed meaningful differences in the outcomes of these methods. Have you noticed any differences in your simulations?
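
    A small sketch contrasting the two variants for a hypothetical respondents-x-items matrix X of 1-7 responses; factors are retained where observed eigenvalues exceed the reference eigenvalues:

        set.seed(9)
        X <- matrix(sample(1:7, 150 * 12, replace = TRUE), nrow = 150)
        obs_eig <- eigen(cor(X))$values

        n_iter <- 200
        # (a) Simulation: random data of the same dimensions and scale.
        ref_sim <- replicate(n_iter,
          eigen(cor(matrix(sample(1:7, length(X), replace = TRUE),
                           nrow = nrow(X))))$values)
        # (b) Permutation: shuffle each observed column independently.
        ref_perm <- replicate(n_iter, eigen(cor(apply(X, 2, sample)))$values)

        # Compare observed eigenvalues with the mean reference eigenvalues.
        cbind(observed  = obs_eig,
              simulated = rowMeans(ref_sim),
              permuted  = rowMeans(ref_perm))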

    -------------------------------------------
    Arthur Kendall
    Social Research Consultants
    -------------------------------------------








  • 10.  RE:PCA of transformed Likert-Scale data

    Posted 03-13-2013 20:41
    Interesting. I looked at PA on the full correlation matrix (PA-F) and on the reduced correlation matrix (SMCs on diagonal, PA-R). When I generated data from uncorrelated factors, PA-F was exactly right most of the time and underfactored by 1 (suggested 1 too few) pretty much otherwise. So I think that's what you observed. PA-R was exactly right more often.

    When I generated data from correlated factors and using a slightly different scheme, there was greater interaction with other experimental factors (factor correlations, variable-to-factor ratio, # of factors). Overall, PA-R was again by far the most accurate. PA-F was exactly right most of the time again, but when factors were correlated, there were very few variables per factor, and communalities were low (sort of the worst case scenario), PA-F typically underfactored by more than 1. In that case PA-R only ever underfactored by 1.

    So, long story short, I guess I would affirm what you had already observed >30 years ago. Other procedures were much more variable; MAP, for instance, didn't fare as well. The number of factors suggested by the eigenvalue-over-1 rule pretty much just depends on the number of measured variables; Gorsuch described that in his book.

    As long as I'm going on, I might as well add one more thing. I looked at the AIC (Akaike Information Criterion computed from MLFA) as another method, for which you select the model with the lowest value, since it measures badness of fit. Interestingly, it was almost entirely immune to my experimental conditions; it always behaved the same. In every condition it was right about 80-90% of the time, and suggested one too many factors otherwise. It did this in scenarios that stumped all the other methods, as well as in those for which the other methods had near-perfect accuracy. I think that's great, because as an analyst you could limit the number of models to examine for interpretability to just 2.


    Briefly on Art's other comments:
    - In practice I find a scree plot to be sufficient in many cases. When it's not clear there are usually other problems with the EFA.
    - Your comment on computing factor scale scores is important. You'd only want to use items that clearly load only on a single factor when computing the mean.
    - I think I read somewhere that a colleague of Kaiser's mentioned he only suggested the Little Jiffy approach at a time when mainframe computing time was valuable; he never called it the best way to decide. It is conceptually appealing to require that a factor explain more variance than a single variable. In practice it just seems to suggest too many factors too often. I have no problem with it as a quick-and-dirty method, but I find it disconcerting that applied researchers interpret whatever number of factors their software package suggests, which usually defaults to that rule. I wonder how many bogus factors have been "discovered" in this way.
    - I've only implemented PA using randomly generated datasets of the same dimensions as what's observed. I never looked into the random-permutations approach. That's interesting. You've seen them work similarly?

    Altogether, it's great to hear your first-hand account of a procedure that has been studied to death for over a hundred years.



    -------------------------------------------
    Robert Pearson
    Assistant Professor
    University of Northern Colorado
    -------------------------------------------








  • 11.  RE:PCA of transformed Likert-Scale data

    Posted 03-13-2013 01:43
    Dear Eugene,

    The Mplus software and website also might be of interest to you:

    http://www.statmodel.com/

    Kind regards,
    Kevin Gray
    Cannon Gray LLC
    www.cannongray.com