ASA Connect


Regression coefficients vs causal coefficients

  • 1.  Regression coefficients vs causal coefficients

    Posted 07-19-2017 14:25
    I came across the following problem. Let's assume I have one dependent variable Y and several independent variables X supposedly affecting it. All these X are exogenous in the sense that I don't know of anything which affects them. Now I want to estimate the relationship between Y and X (assuming everything is linear, etc.) using just regular linear regression techniques. On the other hand, one can use SEM software or other approaches (like Pearl's DAG approach) and obtain so-called "causal coefficients".

    My question - would these coefficients be equal to regression ones in this situation or not? 
    Many thanks for any comments. 

    Igor Mandel, VP, Telmar


  • 2.  RE: Regression coefficients vs causal coefficients

    Posted 07-19-2017 15:00
    This manuscript by Bollen and Pearl may aid your understanding; myth #2 on page 13 specifically addresses the conceptual differences. Short answer would be that the two parametrisations can be constructed to provide conditionally exchangeable distributions, but whether the restrictions imposed under simultaneous regularity conditions are what is truly desired about the system is another matter entirely.

    @incollection{bollen2013eight,
      title={Eight myths about causality and structural equation models},
      author={Bollen, Kenneth A. and Pearl, Judea},
      booktitle={Handbook of Causal Analysis for Social Research},
      pages={301--328},
      year={2013},
      publisher={Springer},
      url={http://ftp.cs.ucla.edu/pub/stat_ser/r393.pdf}
    }

    ------------------------------
    Landon Hurley
    ------------------------------



  • 3.  RE: Regression coefficients vs causal coefficients

    Posted 07-20-2017 09:12
    Many thanks, Landon. However, the conceptual difference is one thing (I could say a lot about it, especially about J. Pearl's version of causality, which has too many ill-formulated problems: http://ssrn.com/abstract=2984045), but the values of the "causal" and regression coefficients in this simple situation (which is what I'm interested in) are quite another. Are they different or not?

    As for the paper you refer to - yes, I am familiar with it and even published (with a colleague) a critical review of this Handbook in Technometrics (Book Reviews, Technometrics, 57:2, 292-300, 2015, DOI: 10.1080/00401706.2015.1052714).

    The only reason I put this question up for discussion is that when I tried to post it, among several others, on the "Causality blog" http://causality.cs.ucla.edu/blog/ (where its originator, J. Pearl, regularly invites questions), it was rejected exactly for the reason that I don't understand the difference between causal and regression coefficients. Well, I plainly admit that they may be different in a complex DAG (and have never denied it), but I still struggle to understand what the difference is when the DAG has its most primitive shape: Y as a function of several X, with no strings attached to the X.

    You mentioned two parametrizations: which ones are they? Are they even possible in this situation, when all X are exogenous?

    Many thanks again - Igor







  • 4.  RE: Regression coefficients vs causal coefficients

    Posted 07-20-2017 10:08
    I sympathize with Igor. I have an R package that implements causality of the type Igor wants. However, nonlinearity seems to be needed for determining that a variable is exogenous.

    Causal Paths from data and new exogeneity tests in {generalCorr} Package for Air Pollution and Monetary Policy

    by H. D. Vinod, Ph.D., Fordham University, NY, June 7, 2017.

    Since causal paths from data are important for all sciences, my R package `generalCorr' provides sophisticated functions. The idea is simply that if X causes Y (path: X->Y) then non-deterministic variation in X is more "original or independent" than similar variation in Y. We compare two flipped kernel regressions: X=f(Y, Z) and Y=g(X,Z), where Z are control variables. Our first two criteria compare absolute gradients (Cr1) and absolute residuals (Cr2), both quantified by stochastic dominance of four orders (SD1 to SD4). Our third criterion (Cr3) expects X to be better able to predict Y than vice versa using generalized partial correlation coefficients r*(X,Y|Z). These methods allow us to create a replacement for the Hausman-Wu medieval-style diagnosis of endogeneity relying on showing that a dubious cure (instrumental variables) works. The ultimate causal path: X->Y depends on a weighted sum (strength) of all three criteria. Bootstrap inference is also available.
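    A toy numerical sketch of this flipped-regression idea (in Python, with a hand-rolled Nadaraya-Watson smoother as a hypothetical stand-in; this is an illustrative simplification, not the package's actual Cr1-Cr3 criteria):

```python
import numpy as np

def nw_fit(x, y, h=0.3):
    # Nadaraya-Watson kernel-regression fitted values of y given x
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = np.sin(x) + 0.1 * rng.normal(size=n)  # true path: x -> y, nonlinear

# Flip the two kernel regressions and compare absolute residuals,
# a toy analogue of comparing Y = g(X, Z) against X = f(Y, Z)
res_y_on_x = np.abs(y - nw_fit(x, y))
res_x_on_y = np.abs(x - nw_fit(y, x))

# If x drives y, predicting y from x leaves the smaller residuals
print(res_y_on_x.mean(), res_x_on_y.mean())
```

    Because sin is not monotone over the range of x, the flipped fit x = f(y) cannot recover x and leaves visibly larger residuals; with a purely linear relation the two directions would be symmetric, which illustrates why nonlinearity matters for this kind of diagnosis.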

    A new version of the package makes it possible to get evidence on causal paths with a few lines of code. Details and examples are in: Vinod, H. D., "Causal Paths and Exogeneity Tests in {generalCorr} Package for Air Pollution and Monetary Policy" (June 6, 2017). Available at SSRN: 

    https://ssrn.com/abstract=2982128

     

    [code lang="r"]
    install.packages("generalCorr")
    require(generalCorr)
    options(np.messages = FALSE)
    causeSummary(airquality)
    [/code]

             cause     response strength  corr.     p-value
        [1,] "Solar.R" "Ozone"  "100"     "0.3483"  "0.00018"
        [2,] "Wind"    "Ozone"  "31.496"  "-0.6015" "0"
        [3,] "Temp"    "Ozone"  "100"     "0.6984"  "0"
        [4,] "Month"   "Ozone"  "31.496"  "0.1645"  "0.0776"
        [5,] "Day"     "Ozone"  "31.496"  "-0.0132" "0.88794"

     



    ------------------------------
    Hrishikesh Vinod
    ------------------------------



  • 5.  RE: Regression coefficients vs causal coefficients

    Posted 07-21-2017 11:34
    Many thanks, Hrishikesh, I looked at your article - very interesting, I'll think more about it. 

    But just for clarification: it seems that your problem (and your proposed solution to it) is how to tell whether the X variables are exogenous, i.e., not under the control of some other variables. And that is very important to know, of course.

    But my problem was, in fact, simpler:

    if I know that the X variables are exogenous (or ignore the possibility that they are not), and
    if I know that the relation between X and Y is linear, and
    if I don't care too much about how the residuals are distributed (normal or not), and
    if there is only one Y and several X -

    would there be any difference between the usual linear regression coefficients and any form of "causal" or "path" coefficients?

    It looks like Brandy replied positively to that question - many thanks, Brandy!

    Thanks to all again for consideration - Igor







  • 6.  RE: Regression coefficients vs causal coefficients

    Posted 07-20-2017 11:29
    Hi Igor,

    Do you have a single Y outcome variable or several Y's (Y1, Y2, ..., Ym)?

    If I had a single Y outcome variable, I would use linear regression.
    While the betas would be the same in path analysis and in linear regression,
    the variance estimates would be unbiased with linear regression.

    In path analysis, the variance estimates are asymptotically unbiased, but biased in smaller samples.
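    The difference can be sketched numerically (a minimal Python sketch, assuming only the standard n - p versus n denominators for the residual variance; not output from any particular SEM package):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 25, 3
X = rng.normal(size=(n, p))
beta = np.array([3.0, 2.0, -1.0])
y = X @ beta + rng.normal(size=n)

# The point estimates are identical under either convention
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# OLS residual variance divides by n - p: unbiased in finite samples
s2_ols = resid @ resid / (n - p)
# ML-style variance (as in full-information fitting) divides by n:
# biased downward in small samples, asymptotically unbiased
s2_ml = resid @ resid / n

print(beta_hat, s2_ols, s2_ml)
```

    The two variance estimates differ only by the factor (n - p)/n, which vanishes as n grows, matching the "asymptotically unbiased" remark above.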

    ------------------------------
    Brandy Sinco, BS, MA, MS
    Research Associate
    ------------------------------



  • 7.  RE: Regression coefficients vs causal coefficients

    Posted 07-24-2017 10:53
    I thank all participants of this discussion for their (alas, futile) efforts to relieve my confusion in this area.
    I understand Rubin's approach to measuring the size of causal effects (see Holland, JASA 1986), but various regression methods to do the same thing seem to shift from casual inference to causal inference (a simple vowel movement) with no justification other than hope.
    I will continue to try to understand, but it has been years -- I fear I am hopeless.

    ------------------------------
    Howard Wainer
    Extinguished Research Scientist
    ------------------------------



  • 8.  RE: Regression coefficients vs causal coefficients

    Posted 07-25-2017 08:42

    Howard, I envy you - it seems you accept Rubin's (counterfactual) approach as "casual" while considering regression (not counterfactual) as "causal". If only any of it were causal and casual, without the quotation marks! Alas... I have collected many arguments against the counterfactual theory in any version (Rubin's, Pearl's, or others') here: http://ssrn.com/abstract=2984045; maybe it supports the point that a real causal theory in statistics does not yet exist and, most likely, will never be created.



    ------------------------------
    Igor Mandel
    Telmar, Inc.
    ------------------------------



  • 9.  RE: Regression coefficients vs causal coefficients

    Posted 07-26-2017 18:13

    I tried to click on the link to Igor Mandel's collection of arguments against counterfactual theory but got a "page cannot be found" message. Is there any alternative way to get to the collection?

     

    thanks

     

    Kevin Little, Ph.D.

    Informing Ecological Design, LLC

    2213 West Lawn Avenue

    Madison, WI  53711

    tel 608.251.4355  fax 888.247.7543

    email klittle@iecodesign.com

    http://www.iecodesign.com

     






  • 10.  RE: Regression coefficients vs causal coefficients

    Posted 07-26-2017 19:20

    Kevin

     

    A little experimentation shows that the cited address is wrongly written. It should read:

    http://ssrn.com/abstract_id=2984045.

    I found this by putting the number 2984045 into the search box on the SSRN site.

     

    Hope this helps

     

    Peter Kenny

     

     






  • 11.  RE: Regression coefficients vs causal coefficients

    Posted 07-27-2017 08:48
    Many thanks, Kevin! I don't know the reason - on my machine, the page opens at this address (I just checked again!), but I really appreciate your efforts and the experimental evidence that the address is http://ssrn.com/abstract_id=2984045
    Igor 


    ------------------------------
    Igor Mandel
    Telmar, Inc.
    ------------------------------



  • 12.  RE: Regression coefficients vs causal coefficients

    Posted 07-28-2017 09:15
    Peter, I just realized that YOU found the way to make a correct link in answering Kevin - many thanks for that, and sorry for my confusion in the previous posts.
    Igor

    ------------------------------
    Igor Mandel
    Telmar, Inc.
    ------------------------------



  • 13.  RE: Regression coefficients vs causal coefficients

    Posted 07-25-2017 21:00
    Hi Igor,

    I'll take a stab at this. Just to be clear, Pearl's approach to causality isn't about *how* to estimate causal effects. It's about the *ability* to estimate causal effects -- identifiability -- from observational data. In other words, Pearl's theorems will tell you whether it's possible to estimate the causal effects---as defined by the DAG---from observational data, but they don't tell you how to estimate them.

    Now to your question... To restate the setup of your problem in light of the above: you have a response variable Y and several X variables X1, ..., Xp, each of which has a direct causal effect on Y and none of which has a causal effect on any other. A DAG representing this scenario would have an arrow from each X variable pointing into Y and no other arrows. Additionally, you are comfortable assuming that these causal effects are linear.

    According to Pearl's theorems (actually, the linear case goes back to Wright, I think), the causal effects in the DAG are identifiable and can be estimated from the observational data.

    Now that Pearl's theorems have established the identifiability of the causal effects, the question becomes: how should we estimate the parameters of Pr(Y | X1, ..., Xp)? We are OK with linearity, so OLS estimation will do nicely. However, we may also use estimation methods from the SEM literature---full-information methods or simultaneous methods (see, for example, Chap. 11 of Kline's "Principles and Practice of Structural Equation Modeling", 4th ed.). These methods were developed primarily to estimate parameters from several equations simultaneously, but there is no reason why we can't use them to estimate parameters from a single equation. In fact, the estimates are equivalent to the OLS estimates in this case. (There might be some small differences in degrees of freedom between full-information methods and OLS, but the estimates themselves are mathematically equivalent.)

    Below is some R code that demonstrates this (with only slight differences due to numerical estimation):

    > library(sem)
    > n <- 100000
    > X1 <- rnorm(n)
    > X2 <- rnorm(n)
    > X3 <- rnorm(n)
    > Y <- 3*X1 + 2*X2 - 1*X3 + rnorm(n)
    > lmout <- lm(Y ~ X1 + X2 + X3 + 0)
    >
    > mod <- specifyEquations(
    + text="Y = beta1*X1 + beta2*X2 + beta3*X3"
    + , covs=c("Y", "X1", "X2", "X3"))
    Read 1 item
    > semout <- sem(mod, data=data.frame(Y=Y, X1=X1, X2=X2, X3=X3))
    >
    > coef(lmout)
    X1 X2 X3
    2.9974998 2.0015475 -0.9978271
    > coef(semout)[1:3]
    beta1 beta2 beta3
    2.9975046 2.0015477 -0.9978332

    Hopefully, that was helpful.

    Best
    McKay

    ------------------------------
    Steven Curtis
    Principal, Decision Science
    The Walt Disney Company
    ------------------------------



  • 14.  RE: Regression coefficients vs causal coefficients

    Posted 07-26-2017 11:31
    Hi Steven,

    Many thanks for your detailed answer and the program; I really appreciate it. Yes, your conclusion coincides with mine - they should be equal in this situation. I repeat myself, but let me state it again: the reason I posted this question was that on Pearl's causality blog I got the answer that they are not equal, which surprised me and led to this thread. However, along the way many new interesting things were touched on, and your answer is a very good illustration of that.
    Especially your thesis that Pearl just tests identifiability but does not give the estimates. I agree that many of the theorems are about that, but in a specific sense: does the path allow one to say something about "causes" or not. In that respect, I have no problem with DAG theory at all. To answer that, you have to have a real DAG first. But when it comes to the simplest case like the one I described, the theory says "you can", while regression says "you cannot" - in the sense that regression coefficients themselves have nothing to do with causality (they may or may not be of a causal nature). Don't you feel the troublesome problem here?
    On the other hand, J. Pearl does provide estimates of causal effects, at least in a certain special form. All his do-operators intend to do just that - he draws different (numerical) conclusions and so on, even when regression is not involved. A different question is how "causal" all that is. I consider it in detail in "Troublesome Dependency Modeling: Causality, Inference, Statistical Learning" (SSRN) and show that it is a special form of "imaginary indicators", similar to the one used in index numbers for more than a hundred years. It is not "causal" in the scientific, "based on facts" meaning.

    Once again - thank you very much for your thorough answer.

    Igor

    ------------------------------
    Igor Mandel
    Telmar, Inc.
    ------------------------------