ASA Connect


Multiple linear regression help

  • 1.  Multiple linear regression help

    Posted 04-25-2015 12:28
    This message has been cross posted to the following eGroups: Social Statistics Section and ASA Connect .
    -------------------------------------------
    Hello all,

    I am running a multiple linear regression for the final capstone project of my MPH, and I am having trouble getting it to run. Here is what's up:

    I am using SPSS. My DV (HIV viral load) is continuous, but I also have it coded as categorical (undetectable, low, and high), which I would rather use in a multinomial regression.

    All of my IVs are categorical and qualitative (coded as 0, 1, 2). All but one are binary (yes = 1, no = 0); the one exception, housing, has three options (1 = permanent housing, 2 = temporary housing, 3 = homeless).

    I can do a simple regression with my DV and a single IV (housing, with permanent housing set as the reference category). Beyond that, every time I add other variables, SPSS does not give a p-value for all of my IVs (for some it just shows a period), and I don't know how to interpret the ones it does give. For example, one of my IVs is "Has history of mental health diagnosis," a yes/no variable. When SPSS gives a p-value for that variable, how do I know whether the significance refers to yes or to no?

    I do have a small sample, n = 101.

    Any help is appreciated! Thanks, everyone.

    Silas

    ------------------------------
    Silas Hyzer
    ------------------------------


  • 2.  RE: Multiple linear regression help

    Posted 04-27-2015 01:27
    Hi,

    Check for missing values, and make sure you don't include another version of your dependent variable among the predictors. Let me know if this helps.

    ------------------------------
    Abdulaziz Farooq
    Statistician
    Aspetar-Qatar Orthopaedic and Sports Medicine Hospital
    ------------------------------


  • 3.  RE: Multiple linear regression help

    Posted 04-27-2015 10:12

    One of the basics of any complex analysis, especially regression modeling, is to do some basic exploratory data analysis (EDA), which includes missing value analysis. EDA also includes bivariate analysis of the relationships between the individual variables and the target. This will often identify unusual relationships and/or missing values.

     

    One of the first steps in analysis is "Get to Know Your Data!" It will talk to you if you listen.

    ------------------------------
    Michael Mout
    MIKS
    ------------------------------
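
A minimal sketch of the two EDA checks mentioned above (missing-value counts and a bivariate look at the outcome by one predictor), written in plain Python rather than SPSS. The records and the variable names `viral_load`, `housing`, and `mental_health_dx` are made up for illustration and are not Silas's data.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"viral_load": 40.0,   "housing": 1, "mental_health_dx": 0},
    {"viral_load": 1200.0, "housing": 2, "mental_health_dx": 1},
    {"viral_load": None,   "housing": 3, "mental_health_dx": 1},
    {"viral_load": 300.0,  "housing": 1, "mental_health_dx": None},
    {"viral_load": 5000.0, "housing": 3, "mental_health_dx": 0},
    {"viral_load": 80.0,   "housing": 1, "mental_health_dx": 1},
]

def missing_counts(rows):
    """Count missing (None) values per variable."""
    counts = Counter()
    for row in rows:
        for name, value in row.items():
            if value is None:
                counts[name] += 1
    return dict(counts)

def complete_cases(rows):
    """Keep only rows with no missing value on any variable."""
    return [row for row in rows if all(v is not None for v in row.values())]

def mean_by_group(rows, outcome, group):
    """Mean of the outcome within each level of one predictor,
    skipping rows that are missing either variable."""
    buckets = defaultdict(list)
    for row in rows:
        if row[outcome] is not None and row[group] is not None:
            buckets[row[group]].append(row[outcome])
    return {level: mean(vals) for level, vals in sorted(buckets.items())}

print(missing_counts(records))
print(len(complete_cases(records)), "complete cases out of", len(records))
print(mean_by_group(records, "viral_load", "housing"))
```

The complete-case count is worth watching as predictors are added: a procedure that drops any case with a missing value on any model variable can shrink an n of 101 quickly.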




  • 4.  RE: Multiple linear regression help

    Posted 04-27-2015 10:16

    Before running your regression model, write out your regression equation.

    For example, Y = B0 + B1*Housing + ..., and so on for the other predictors.

     Each Bi indicates the effect of a unit increase in Xi, with all other X's held constant.

    Also, if you have 0,1,2 coding, I would create indicator variables for 1 versus 0 and 2 versus 0. By using the 0,1,2 coding as a single numeric predictor, your model includes the assumption that the change in Y is the same from 0 to 1 as from 1 to 2. I suggest testing that assumption.

    Hope this helps you to get started.

    ------------------------------

    Brandy Sinco
    Research Associate
    ------------------------------
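
The indicator (dummy) coding suggested above can be sketched as follows. This is an illustration in plain Python, not SPSS syntax; the helper `dummy_code` and the sample values are hypothetical, and the 1/2/3 housing codes with permanent housing (1) as the reference follow Silas's description.

```python
def dummy_code(values, reference):
    """Return (column_names, rows) of 0/1 indicators, one column for
    every observed level except the reference level."""
    levels = sorted(set(values) - {reference})
    names = [f"level_{lvl}_vs_{reference}" for lvl in levels]
    rows = [[1 if v == lvl else 0 for lvl in levels] for v in values]
    return names, rows

# Hypothetical housing codes: 1 = permanent, 2 = temporary, 3 = homeless.
housing = [1, 2, 3, 1, 3, 2]
names, rows = dummy_code(housing, reference=1)

print(names)
for value, indicators in zip(housing, rows):
    print(value, indicators)
```

Each indicator gets its own coefficient, so the difference between temporary and permanent housing is estimated separately from the difference between homeless and permanent housing, instead of being forced onto one equally spaced scale.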




  • 5.  RE: Multiple linear regression help

    Posted 04-28-2015 08:24

    Brandy,

    In general, in a multiple regression, "with all other X's held constant" is not the correct interpretation.  Each Bi summarizes the change in Y per unit change in Xi after adjusting for simultaneous linear change in the other X's in the data at hand.

    The data may support predictions in which the other X's are held constant, but that is a different purpose of regression.  When we interpret a regression coefficient, we are usually making a statement about the size of an effect.

    Many textbooks give the held-constant interpretation, but straightforward mathematics shows that that is not how multiple regression works.

    ------------------------------
    David Hoaglin
    ------------------------------
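
One way to see the "adjusting for the other X's" reading numerically is the standard least-squares identity that the coefficient of X1 in a regression on X1 and X2 equals the slope obtained after the linear contribution of X2 has been removed from both Y and X1. The sketch below checks this for two predictors using only the Python standard library; the data and function names are made up for illustration.

```python
def centered(xs):
    """Subtract the mean from every value."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def simple_slope(y, x1):
    """Slope of x1 in y ~ 1 + x1."""
    yc, c1 = centered(y), centered(x1)
    return dot(c1, yc) / dot(c1, c1)

def multiple_regression_b1(y, x1, x2):
    """Coefficient of x1 in y ~ 1 + x1 + x2, from the normal equations
    on centered data."""
    yc, c1, c2 = centered(y), centered(x1), centered(x2)
    s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
    s1y, s2y = dot(c1, yc), dot(c2, yc)
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

def adjusted_slope_b1(y, x1, x2):
    """Slope of (y adjusted for x2) on (x1 adjusted for x2)."""
    yc, c1, c2 = centered(y), centered(x1), centered(x2)
    g1 = dot(c1, c2) / dot(c2, c2)   # slope of x1 on x2
    gy = dot(yc, c2) / dot(c2, c2)   # slope of y on x2
    r1 = [a - g1 * b for a, b in zip(c1, c2)]   # x1 with x2 removed
    ry = [a - gy * b for a, b in zip(yc, c2)]   # y with x2 removed
    return dot(r1, ry) / dot(r1, r1)

# Made-up data in which x1 and x2 are correlated.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.0]

print("simple slope of x1:        ", simple_slope(y, x1))
print("multiple-regression B1:    ", multiple_regression_b1(y, x1, x2))
print("slope after adjusting x2:  ", adjusted_slope_b1(y, x1, x2))
```

The last two numbers agree, while the simple slope differs because x1 and x2 move together in these data.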




  • 6.  RE: Multiple linear regression help

    Posted 05-04-2015 14:03

    Suppose we have the following regression equation:

    Y = B0 + B1*Age + B2*Gender + B3*Income + e

    Yest(Age=A) = B0 + B1*A + B2*Gender + B3*Income

    Yest(Age=A+1) = B0 + B1*(A+1) + B2*Gender + B3*Income

    Yest(Age=A+1) - Yest(Age=A) = B1, the estimated change for a 1-year increment in age.

    David, Are you saying that this is not mathematically correct?  Or am I missing something?

    ------------------------------
    Brandy Sinco
    Research Associate
    ------------------------------




  • 7.  RE: Multiple linear regression help

    Posted 04-28-2015 20:46

    Turning your three-valued predictor into two dummy variables, each coded 0 or 1, is the better approach because it avoids assuming equal distance between the categories.

    David Greenberg, Sociology Department, New York University


    ------------------------------
    David Greenberg
    Professor
    New York University
    ------------------------------




  • 8.  RE: Multiple linear regression help

    Posted 04-29-2015 09:36
    David H. said it as it should be.

    ------------------------------
    Raid Amin
    Professor
    University of West Florida
    ------------------------------




  • 9.  RE: Multiple linear regression help

    Posted 05-01-2015 13:56
    I agree that you need to do exploratory analysis, and I would also suggest a stepwise model, which would give you more information.
    Sometimes you lose information by categorizing a continuous variable. Remember also that a multiple regression with main effects only assumes linear relationships and no interaction between your independent variables. You can test those assumptions, and probably should.
    Pat Fox





  • 10.  RE: Multiple linear regression help

    Posted 05-04-2015 10:02
    Silas, if you are using ordinal data to describe your independent variables, you may want to express these as dummy variables and do generalized linear modeling, not the standard least squares. 

    ------------------------------
    Aubrey Magoun
    Consultant
    Applied Research & Analysis, Inc.
    ------------------------------




  • 11.  RE: Multiple linear regression help

    Posted 05-04-2015 10:39

    Silas,

    There is no need to code your housing variable using indicator variables yourself; simply enter it as a categorical variable (factor) rather than as a covariate, and SPSS will take care of the coding for you. In fact, that is probably what is wrong with your model. If you are entering all the binary-valued variables as factors, SPSS will generate a full factorial model, including interaction terms among all the variables. This model will easily exhaust the degrees of freedom with only 101 cases. Specify a custom model using main effects only, and you should be able to proceed. Alternatively, specify the binary-coded variables as covariates instead of factors (be sure to use 0-1 coding in that case).

    ------------------------------
    John Bauer
    ------------------------------
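
A rough count shows how quickly a full factorial model outgrows 101 cases. The sketch below assumes, purely for illustration, one three-level factor (housing) plus six yes/no factors; the thread does not say how many predictors Silas actually has.

```python
from math import prod

def main_effects_params(levels):
    """Intercept plus (levels - 1) coefficients for each factor."""
    return 1 + sum(k - 1 for k in levels)

def full_factorial_params(levels):
    """A full factorial model has one parameter per cell of the
    cross-classification, i.e. the product of the numbers of levels."""
    return prod(levels)

# Hypothetical design: housing (3 levels) and six binary predictors.
levels = [3] + [2] * 6
n_cases = 101

print("main effects only:", main_effects_params(levels))
print("full factorial:   ", full_factorial_params(levels))
print("cases available:  ", n_cases)
```

Under that assumption the main-effects model needs 9 parameters, while the full factorial needs 192, more than the number of cases, so many terms cannot be estimated.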




  • 12.  RE: Multiple linear regression help

    Posted 05-06-2015 22:01

    Silas, I wanted to comment on your DV, which is HIV viral load, and which you indicate is continuous but has a lower limit of detection. Because your DV has these characteristics, you may wish to look into something called Tobit Regression, which was originally invented by an economist named Tobin, but which has since been used by the biomedical community to analyze serum biomarker data including viral-load data in both the HIV and Hepatitis-C contexts, where viral loads are frequently below the limit of detection.

    Does SPSS do Tobit Regression? Apparently not by itself, but add-ons apparently do exist. Via a Google search, I found the following on LinkedIn, posted by someone named Jon Peck writing under SPSS Power Users:

    "Tobit regression is available using the SPSSINC TOBIT REGR extension command. That requires the R Essentials. They can be downloaded from the SPSS Community website at www.ibm.com/developerworks/spssdevcentral. Once installed (be sure to read the installation instructions and prerequisites), you will have a dialog box for tobit regression under Analyze > Regression.

    "You might also be interested in the Heckman generalization of tobit regression. That can be done with the STATS HECKMAN REGR extension command available from the same location." 

    ------------------------------
    Eric Siegel
    Biostatistician
    Univ of Arkansas for Medical Sciences of Biostatistics
    ------------------------------
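
To make the idea behind Tobit regression concrete, here is a minimal sketch of its log-likelihood for an outcome that is left-censored at a detection limit. It is not the SPSS extension command quoted above; the function names are hypothetical, and it assumes normally distributed errors, with censored observations recorded at or below the limit.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, mu, sigma, lower_limit):
    """Log-likelihood for left-censoring at lower_limit.

    y           observed outcomes (censored ones recorded at or below the limit)
    mu          model mean for each observation (e.g. B0 + B1*x1 + ...)
    sigma       error standard deviation
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi <= lower_limit:
            # Censored: all we know is that the true value is below the limit.
            total += math.log(normal_cdf((lower_limit - mi) / sigma))
        else:
            # Observed: ordinary normal log-density.
            z = (yi - mi) / sigma
            total += -math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
    return total

# Made-up example: three observations, the first below a detection limit of 1.0.
print(tobit_loglik(y=[1.0, 2.5, 3.0], mu=[1.2, 2.4, 3.3], sigma=0.8, lower_limit=1.0))
```

Fitting the model means choosing the coefficients and sigma that maximize this quantity. Ordinary least squares instead treats the censored values as if they were exact measurements.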