ASA Connect

  • 1.  Analyzing questionnaire data

    Posted 03-31-2015 05:38

    Dear ASA members,

     I am puzzled by the topic of analyzing questionnaire data. Suppose I have a large questionnaire consisting of groups of questions. One of these groups might consist of questions that try to gauge the participants' agreement (or disagreement) on how well the state takes care of environmental issues. The answers are collected using an ordered 5-point Likert scale (1 = total agreement, ..., 5 = total disagreement). My ultimate goal is to model one of these groups (a variable constructed from the individual answers) using a number of socio-demographic and other explanatory variables.

     My understanding is that, for example, in the social sciences people often calculate Cohen's Kappa to check the inter-agreement. The next step is to combine the individual answers within a question group (for example, state and environmental issues). People seem to be fond of adding the individual values or taking their average. This new variable (based on the sum or the average) is then treated as a continuous variable. Here is where I am most puzzled: the original variables (the individual answers) are ordinal, but after some transformation (summing, averaging, etc.) the new variable is treated as continuous.

     Have any of you worked on or researched this sort of problem? I am currently reading a book by Alan Agresti called 'Analysis of Ordinal Categorical Data', which deals with ordered rather than nominal responses. I am not yet sure whether this book gives me all the answers to this puzzling problem. I have only glanced through some social science books, but those seemed to side-step the potential problem of moving from an ordinal scale to an interval scale. They usually just present a method of combining the individual answers and simply move on.

     I would very much appreciate any responses or suggestions. Especially, if someone knows a book or a paper where this issue is explicitly tackled, I would be more than happy.

     Best wishes,

    Eero Liski

    ------------------------------
    Eero Liski

    University of Tampere

    Finland
    ------------------------------



  • 2.  RE: Analyzing questionnaire data

    Posted 04-01-2015 10:55
    The procedure you describe implicitly transforms each original ordinal variable from 1 to 5 into an interval variable from 1 to 5, and then proceeds to add or average these interval variables and appropriately interpret the result as an interval variable. Your question amounts to asking whether the transformation is appropriate. A suggestion for transformation of ordinal to interval variables is given in Abelson, R. P., and Tukey, J. W., Efficient utilization of non-numerical information in quantitative analysis: General theory and the case of simple order, Annals of Mathematical Statistics, 34(4), 1347-1369, 1963.
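
    To make the implicit transformation concrete, here is a minimal Python sketch. The respondents, their answers, and the "stretched" score set are all made up for illustration (they are not the Abelson-Tukey scores); the point is only that the ordering of composite scores can depend on which interval scores are assigned to the ordinal categories.

```python
# Hypothetical answers of two respondents to four 5-point items.
responses = {
    "A": [4, 4, 4, 4],
    "D": [5, 5, 3, 2],
}

# The usual implicit choice: category k is scored as the number k.
equal_spacing = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 5.0}

# An illustrative (assumed) monotone alternative that places
# category 5 further away from category 4.
stretched = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 7.0}


def composite_mean(answers, scores):
    """Average the interval scores assigned to the ordinal answers."""
    return sum(scores[a] for a in answers) / len(answers)


for name, answers in responses.items():
    print(name,
          composite_mean(answers, equal_spacing),
          composite_mean(answers, stretched))
```

    With equal spacing, A scores higher than D; with the stretched scores, the order reverses. Both score sets respect the ordinal information, so the choice between them is exactly the assumption in question.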
     
    ------------------------------
    James Schmeidler
    Icahn School of Medicine at Mount Sinai
    ------------------------------




  • 3.  RE: Analyzing questionnaire data

    Posted 04-01-2015 11:37

    I wouldn't get overly concerned about whether the individual distributions are continuous, but rather whether composite variables have reasonably continuous distributions. Normal distributions might be nice from a mathematical point of view, but the appropriate form of the distributions of the composite variables (e.g., simple sums of related items) depends on the circumstances. Some steps in analyzing a questionnaire include the following:

    1.      Assess whether the univariate distributions of the questions make sense, e.g., whether certain extremes are very rare.

    2.      Perform variable clustering to assess whether the correlations of the questions make sense.

    3.      For each set of variables defining a composite variable, compute the item-total correlations. (For each item-total correlation, the item should not be included in the total.) If a particular item-total correlation is weak, then there may be a cultural problem with the wording of the item (question) for the particular subjects who have filled out the questionnaire.

    4.      Identify any multivariate outliers to assess whether the questionnaire seems inappropriate for any of the subjects.

    5.      Perform factor analysis with oblique rotation. Compute the correlations of the factor scores with the composite variables. If the correlations are strong, stay with the simple definitions of the composite variables. If the correlations are weak, then it might make sense to weight the items unequally when computing composite scores.
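
    Step 3 above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function name and the small data matrix are invented, and each item is correlated with the sum of the other items, so the item itself is excluded from the total.

```python
import numpy as np


def corrected_item_total_correlations(items):
    """Correlate each item with the total of the remaining items.

    items: 2-D array-like, rows = respondents, columns = the items
    that define one composite variable.
    """
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    correlations = []
    for j in range(items.shape[1]):
        rest = total - items[:, j]  # total with item j left out
        correlations.append(np.corrcoef(items[:, j], rest)[0, 1])
    return np.array(correlations)


# Made-up answers of five respondents to four 5-point items.
# The last item is deliberately unrelated to the first three.
data = np.array([
    [1, 2, 1, 5],
    [2, 2, 3, 1],
    [3, 3, 3, 4],
    [4, 5, 4, 2],
    [5, 4, 5, 3],
])

r = corrected_item_total_correlations(data)
print(np.round(r, 2))
```

    In this toy data the fourth item has a negative item-total correlation, which is the kind of signal that would prompt a closer look at the wording of that question.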






  • 4.  RE: Analyzing questionnaire data

    Posted 04-01-2015 13:22

    Dear Eero,

    You might want to check out something called item response theory that people in psychology and educational measurement use to help interpret responses to Likert scale items, among others. You might be most interested in what are called polytomous IRT models, such as Samejima's graded response model and Andrich's rating scale model. You will see definite similarities with the models that Agresti describes. There's lots of stuff about IRT on the web. One good hard-copy source is Susan Embretson and Steven Reise's book, "Item response theory for psychologists," published by Psychology Press in 2000.

    Have fun!

    Charlie

    ------------------------------
    Charles Lewis
    Educational Testing Service
    ------------------------------




  • 5.  RE: Analyzing questionnaire data

    Posted 04-02-2015 02:37
    Dear all,

    Thank you greatly for your responses. I appreciate the references and I will look into those.

    I might be horribly wrong, but I understood that, for example, the cumulative logit modeling approach is for strictly ordinal responses. That is, a 5-point ordinal response takes discrete values (1,2,3,4,5). However, when one combines answers (which are assumed to measure the same thing) by, for example, averaging, then naturally this new (averaged) variable is on an interval scale between 1 and 5. I understand that at least the cumulative logit model is then no longer valid. But as I said, I will look into the suggestions you all made.
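
    As a small check on what averaging actually produces, here is a Python sketch that enumerates every possible mean of a group of 5-point items. The function name and the choice of four items are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product


def possible_means(n_items, n_categories=5):
    """Return the sorted distinct means of n_items answers on 1..n_categories."""
    values = set()
    for answers in product(range(1, n_categories + 1), repeat=n_items):
        values.add(Fraction(sum(answers), n_items))
    return sorted(values)


# Hypothetical question group with four items.
means = possible_means(4)
print(len(means), [float(m) for m in means])
```

    With four items the average can take 17 distinct values, spaced 0.25 apart between 1 and 5, so the averaged variable is still discrete, only on a finer grid than a single item.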

    This is an interesting topic - something to have fun with for sure!

    ------------------------------
    Eero Liski

    Doctoral Student

    University of Tampere
    ------------------------------