Another q for the stats people!
People worry about collinearity (cf blog post below).
Consider a scenario in which the collinear predictors are just controls to account for confounding.
Including both of them doesn't impair the precision with which the effect of interest is estimated, does it?
Comments
Fit Y ~ b1*X1 + b2*X2 + b3*X3, and we care about SE(b1). [Y, X1:X3 are all centered and scaled] We assume that Cor(X2, X3) is "large", but Cor(X1, X2) and Cor(X1, X3) are "small".
X'X = [A  B]
      [B' C]
(A = 1, B = [Cor(X1,X2), Cor(X1,X3)], C = Cor(X2:X3), i.e. the 2x2 correlation matrix of X2 and X3)
SE(b1) comes from (X'X)^-1[1,1], the first diagonal element.
2/n
A block matrix can be inverted blockwise (https://en.wikipedia.org/wiki/Block_matrix#Inversion), and the top-left element of that inverse is:
(A - BC^-1B')^-1
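A quick numerical check of that blockwise formula, as a rough sketch. The correlation values below (0.1, 0.1, 0.9) are made up for illustration and are not from the thread; the variable names are mine as well.

```python
import numpy as np

# Hypothetical correlations, chosen only for illustration:
# X2 and X3 are strongly correlated, X1 is weakly correlated with both.
r12, r13, r23 = 0.1, 0.1, 0.9

R = np.array([[1.0, r12, r13],
              [r12, 1.0, r23],
              [r13, r23, 1.0]])

A = R[0, 0]      # scalar 1 (X1 is scaled)
B = R[0, 1:]     # [Cor(X1,X2), Cor(X1,X3)]
C = R[1:, 1:]    # 2x2 correlation matrix of X2 and X3

# Top-left element of the full inverse, computed directly ...
top_left_direct = np.linalg.inv(R)[0, 0]
# ... and via the blockwise formula (A - B C^-1 B')^-1.
top_left_schur = 1.0 / (A - B @ np.linalg.solve(C, B))

print(top_left_direct, top_left_schur)
```

Both numbers should agree, and with these inputs the value stays close to 1 because B C^-1 B' is small whenever X1 is only weakly correlated with X2 and X3, however strongly X2 and X3 correlate with each other.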
3/n
the matrix should be:
M = n*[A  B]
      [B' C]
and
SE(b1)^2 = (SE(res)^2 * M^{-1})[1,1]
(inverse of M times the squared residual standard error)
so SE(b1)^2 = SE(res)^2 * (1/n) * (1 - BC^{-1}B')^-1
But the same argument still holds
But let's simulate!
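One way such a simulation could look, as a minimal sketch rather than the code actually used in this thread. Everything here is an assumption for illustration: the helper name `ols_se`, the sample size, the true coefficients (all 1), the weak correlation of 0.1 between X1 and the controls, and the two values of Cor(X2, X3) being compared.

```python
import numpy as np

def ols_se(rho, n=5000, r1=0.1, seed=0):
    """Simulate one dataset and return the OLS standard errors of b1, b2, b3.

    rho is Cor(X2, X3); r1 is Cor(X1, X2) = Cor(X1, X3).
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    R = np.array([[1.0, r1, r1],
                  [r1, 1.0, rho],
                  [r1, rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(3), R, size=n)
    y = X @ np.array([1.0, 1.0, 1.0]) + rng.normal(size=n)

    Xd = np.column_stack([np.ones(n), X])           # add intercept
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)   # OLS fit
    resid = y - Xd @ beta
    sigma2 = resid @ resid / (n - Xd.shape[1])      # residual variance
    cov = sigma2 * np.linalg.inv(Xd.T @ Xd)         # covariance of estimates
    return np.sqrt(np.diag(cov))[1:]                # drop the intercept's SE

se_low = ols_se(rho=0.0)     # controls uncorrelated with each other
se_high = ols_se(rho=0.95)   # controls strongly collinear

print("SE(b1), SE(b2), SE(b3) with rho=0.00:", se_low)
print("SE(b1), SE(b2), SE(b3) with rho=0.95:", se_high)
```

Under these assumptions SE(b1) should be nearly the same in both runs, while SE(b2) and SE(b3) grow a lot when the two controls are collinear.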
https://x.com/jmwooldridge/status/1483493723233259527
"Proof" by example:
What is that about if collinearity only affects the SEs?
https://link.springer.com/article/10.3758/s13428-015-0624-x
https://journals.sagepub.com/doi/abs/10.1177/0013164418817801
They would argue that centering can help with model selection in cases where we include interaction terms.
But in DoE traditionally *all* variables are treatment variables.
https://www.science.org/doi/10.1126/science.adi6000
1/3
For example, for linear dependence, you could fit linear regression models
X_j ~ X_1 + … + X_n
(with X_j omitted from RHS)
and check the size of Rsq.
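A rough sketch of those auxiliary regressions. The function name `aux_r2` and the toy data are mine, not from the thread; the toy data is built so that X3 is almost exactly X1 + X2, which pairwise correlations alone would understate.

```python
import numpy as np

def aux_r2(X):
    """R-squared from regressing each column of X on all other columns.

    Hypothetical helper for illustration; 1 / (1 - R^2) is the usual
    variance inflation factor for that column.
    """
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # All other predictors plus an intercept on the right-hand side.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        centered = target - target.mean()
        out[j] = 1.0 - (resid @ resid) / (centered @ centered)
    return out

# Toy data (an assumption): X3 is nearly a linear combination of X1 and X2.
rng = np.random.default_rng(1)
x1 = rng.normal(size=2000)
x2 = rng.normal(size=2000)
x3 = x1 + x2 + 0.1 * rng.normal(size=2000)
X = np.column_stack([x1, x2, x3])

r2 = aux_r2(X)
print("auxiliary R^2 per predictor:", r2)
print("largest pairwise |correlation|:",
      np.abs(np.corrcoef(X, rowvar=False) - np.eye(3)).max())
```

In this toy setup every auxiliary R^2 should be close to 1 even though no single pairwise correlation is, which is the point of checking Rsq rather than a correlation matrix.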
How on earth would you check correlation?
P.S.: Are we talking about correlation between X_j and the linear combination of all other predictors? How is “correlation” defined here?