
  • 1.  Reversed Factors?

    Posted 12-27-2014 10:29
    Hi All,

    I have a pretty simple question about PROC LOGISTIC that I just cannot figure out. (Maybe it's the holiday season, but I think my brain stopped working two weeks ago.)

    I am doing a comparison of graduation rates between students with no paying job (coded 0), one paying job (coded 1) and multiple paying jobs (coded 2).  The variable is the NCES BPS variable "NumJobs09".  I wanted to compare those with multiple jobs to those with one job in a CONTRAST statement because the model statement should give me comparisons between the "no job" people and the two other groups.


    Below is the statement used (I have some other comparisons, but they are not of interest here, so I'll ignore them):

    proc logistic data = logistic_data;
        class var1 (ref='1') var2 (ref = '0') var3 (ref = '0') var4 (ref = '0')
        numjob09 (ref = '0') / param = glm;
        model bs (event='1')= var1 var2 var3 var4 var5 var6 numjob09 var8 var9 var10 / clodds=wald lackfit rsq;
        contrast 'One Job vs. Multiple Jobs' numjob09 0 1 -1 / estimate;
        format numjob09 j.;
        ....;
    run;

    I have a format that I put on it too:

    proc format;
        value j 0 = 'No Job'
                1 = 'One Job'
                2 = 'More than one Job';
    run;

    But, when I checked the class level information, it shows:

    NUMJOB09    More than one Job    1    0    0
                No Job               0    1    0
                One Job              0    0    1



    Also, the MLE estimates (with degrees of freedom listed in right column) suggest that the "One Job" group is the base:
                  

    NUMJOB09    More than one Job    1
    NUMJOB09    No Job               1
    NUMJOB09    One Job              0



    First question:  Does the CONTRAST statement (which identifies the one job group as the base for the comparison with the multiple jobs group) override whatever reference you put on the model statement, specifically numjob (ref=0)?  Is this a consequence of the param=glm option?
     
    The odds ratio section (with point estimates) seems to confirm this:

    NUMJOB09 More than one Job vs One Job    2.063
    NUMJOB09 No Job vs One Job               1.308


    The contrast test results seem to ignore the no job section (as requested):

    Contrast Test Results

    Contrast                     DF    Wald Chi-Square    Pr > ChiSq
    One Job vs. Multiple Jobs     1            29.1577        <.0001




    Second question: Since I have odds ratios for 2 vs. 1  and 0 vs. 1, can I just compare these two to get 2 vs. 0  (2v1 / 0v1) ?

    Third question:  Can someone please explain why the class level information lists 2-0-1 rather than 2-1-0?

    Like I said, it's the holidays and I really should be enjoying my vacation, but this one thing is driving me insane.

    Any help you can give would be greatly appreciated. 

    Thanks in advance,
    Ray


    -------------------------------------------
    Raymond Mooring
    Senior Statistical Consultant
    Analysis Made Easy
    -------------------------------------------


  • 2.  RE: Reversed Factors?

    Posted 12-27-2014 10:52
    First, note that SAS is using the formatted order, which you probably don't want. Use

    PROC LOGISTIC ORDER=INTERNAL;

    to get SAS to use the internal ordering, which is almost always what people want to do. Second, most of your questions can be addressed by properly structured CONTRAST statements. Your reference category (EVENT='1') is the one job, which won't be changed by the ORDER= option. Even if the MODEL statement gives you specific comparisons, you can still do them as CONTRAST statements; I probably would, simply to ensure that my intuition was correct. Finally, don't overlook the ESTIMATE statement.
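
    A minimal sketch of how this advice might look in code, keeping only NUMJOB09 from the original model for brevity and reusing the data set, response, and format names from the first post. As far as I can tell from the SAS/STAT documentation, ORDER= on the PROC LOGISTIC statement controls the response levels, while the order of CLASS levels is set by ORDER= on the CLASS statement, so the sketch puts it there. Two further assumptions: REF= is matched against the formatted value, so it is written as 'No Job' rather than '0'; and EVENT='1' only selects the modeled response level, not the reference level of NUMJOB09.

    ```sas
    /* Format from the original post. */
    proc format;
        value j 0 = 'No Job'
                1 = 'One Job'
                2 = 'More than one Job';
    run;

    /* Assumes logistic_data (from the original post) already exists. */
    proc logistic data = logistic_data;
        /* ORDER=INTERNAL sorts the levels by the stored values 0, 1, 2.  */
        /* REF= uses the formatted label because a format is attached.    */
        class numjob09 (ref = 'No Job') / param = glm order = internal;
        model bs (event = '1') = numjob09;
        /* All pairwise odds ratios with Wald limits, including 2 vs. 0.  */
        oddsratio numjob09 / diff = all cl = wald;
        format numjob09 j.;
    run;
    ```

    The ODDSRATIO statement does not depend on coefficient order, which matters here: CONTRAST coefficients follow the level order shown in Class Level Information, so with the formatted order (More than one Job, No Job, One Job) the coefficients 0 1 -1 appear to compare No Job with One Job rather than One Job with multiple jobs. On the second question, the ratio 2.063 / 1.308 (about 1.58) should match the 2 vs. 0 point estimate, but only the model output supplies its confidence limits.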

    -------------------------------------------
    Paul Thompson
    Director, Methodology and Data Analysis Center
    Sanford Research/USD
    -------------------------------------------




  • 3.  RE: Reversed Factors?

    Posted 12-27-2014 11:21
    Ray, I have a guess. Try taking the format off and see what that does. It appears as if the parameters are being estimated in alpha order according to the assigned format.

    -------------------------------------------
    David Mangen
    Owner
    Mangen Research Associates, Inc.
    -------------------------------------------
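
    One way to try the "take the format off" suggestion, sketched with only NUMJOB09 in the model. A FORMAT statement that names the variable but no format removes the association for that step. Since the original code attaches the format inside PROC LOGISTIC, simply deleting that FORMAT statement should have the same effect; the version below also covers the case, assumed here, where the format is stored with the data set.

    ```sas
    /* Assumes logistic_data (from the original post) already exists. */
    proc logistic data = logistic_data;
        /* With no format, the levels are the raw values, so ref = '0' matches. */
        class numjob09 (ref = '0') / param = glm;
        model bs (event = '1') = numjob09;
        /* Variable name with no format name: removes the format in this step. */
        format numjob09;
    run;
    ```

    Check the Class Level Information table before writing CONTRAST coefficients, since with PARAM=GLM the reference level is listed last.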


  • 4.  RE: Reversed Factors?

    Posted 12-27-2014 11:32
    Thanks for all the details; it definitely helps understand the issue.

    If the values were left as 0, 1, 2 you'd be good to go, since SAS sorts numeric values that way by default. But because you formatted them, it sorts by the formatted labels alphabetically; hence it is reading them in as M, N, O.

    I can think of two solutions offhand.
    The first is easiest: just modify the format labels to be "0 - No job", "1 - ...", and "2 - ...". That tricks SAS into using them in the desired order. It would be my preferred approach, as it applies to all procs so long as the variable is formatted in a data step. This issue pops up with a lot of ordinal assessments, so this is how I format them; it also reminds me of the categories, since for some scales low is better and for others high is better.

    The second approach takes a bit of tweaking to the logistic code. The exact syntax escapes me, but I think it is something like the ORDER= option. That may only work in PROC FREQ, but otherwise, if you use a class statement, I believe you can tweak it with param=ref ref='No Job'. If you search the SAS online help, the syntax will be clear.

    Either approach should convince SAS to use the no job category as your reference and not require additional changes to your code or contrasts.
    Good luck!
    Scott Miller
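
    A rough sketch combining both approaches, with only NUMJOB09 in the model. The format name jnum is made up for illustration, and the labels follow the prefix pattern described above. One assumption worth flagging: REF= is compared with the formatted value, so once the labels change, ref = '0' from the original code would need to change as well.

    ```sas
    proc format;
        /* Numeric prefixes make alphabetical order match the coded order. */
        /* jnum is a hypothetical format name.                             */
        value jnum 0 = '0 - No Job'
                   1 = '1 - One Job'
                   2 = '2 - More than one Job';
    run;

    /* Assumes logistic_data (from the original post) already exists. */
    proc logistic data = logistic_data;
        /* PARAM= takes a keyword without quotes; REF= takes the formatted label. */
        class numjob09 (ref = '0 - No Job') / param = ref;
        model bs (event = '1') = numjob09;
        format numjob09 jnum.;
    run;
    ```

    With PARAM=REF the reference level has no design column, so a CONTRAST for one job versus multiple jobs would take two coefficients (1 -1) rather than three. The Class Level Information table is the place to confirm this.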

    -------------------------------------------
    Scott Miller
    Biostatistician
    Clinipace Worldwide
    -------------------------------------------





  • 5.  RE: Reversed Factors?

    Posted 12-27-2014 12:23
    The issues involving which values correspond to the "success" are common to all the statistical packages I have used.  I have used the small examples from Agresti's Categorical Data Analysis text, where the model is explicitly defined in terms of success and parametrization (some other authors love to use negatives, for instance), in order to diagnose and understand exactly what the heck SAS is trying to interpret as the "1."

    Then you can crush it into submission with the appropriate descending and/or data pre-sorting statements, if results are backward.
    -Mark
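
    For the response side, a small sketch of the DESCENDING approach, reusing the names from the first post and assuming bs is coded 0/1. For a binary response this should model the same event as the EVENT='1' option already used in the original code; the Response Profile table reports which level is being modeled.

    ```sas
    /* Assumes logistic_data (from the original post) already exists. */
    /* DESCENDING reverses the response order, so bs = 1 is modeled.  */
    proc logistic data = logistic_data descending;
        class numjob09 / param = ref order = internal;
        model bs = numjob09;
    run;
    ```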

    -------------------------------------------
    Mark Lancaster
    Northern Kentucky University
    -------------------------------------------




  • 6.  RE: Reversed Factors?

    Posted 12-27-2014 15:50
    Thank you ... thank you .... thank you all.  Now, I can enjoy the rest of my "vacation".  I absolutely love this e-group.  You guys are awesome!

    Happy Holidays!


    -------------------------------------------
    Raymond Mooring
    Senior Statistical Consultant
    Analysis Made Easy
    -------------------------------------------