matloff.bsky.social • 19 hours ago

In the existing lit, the missingness is not on either side of the equation, because there is no equation, i.e. no predictive modeling. It simply assumes one has data, and it is up to the users what they want to do with the data after NAs have been dealt with, including predictive modeling. 🧵 1/

Comments

lucystats.bsky.social•19 hours ago

No, I mean the missingness is in Y (or the variable you are calculating stats or conditional means on), not X (the covariates). In Epi, lots of times the missingness is not in the outcome but in the covariates.

lucystats.bsky.social•19 hours ago

It matters because if the missingness is in X and you want E[Y|X] there are lots of times it doesn’t matter, even if MNAR.

matloff.bsky.social•19 hours ago

Doesn't matter, the methods in the lit are neutral on that.

lucystats.bsky.social•19 hours ago

Not so! If X is missing and all I care about is Y|X, it definitely matters. Many times complete case analyses will give me unbiased results even when X is missing not at random. I don't think this is well understood!

lucystats.bsky.social•19 hours ago

(I’m not talking about prediction I am talking about understanding the relationship between Y and something, a coefficient, causal effect etc)
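The claim above about complete case analyses can be illustrated with a small simulation. This is a minimal sketch, not something posted in the thread: the linear model, the coefficient values, and the logistic missingness mechanism are all assumptions chosen for illustration. X goes missing with a probability that depends on X itself (so it is MNAR), but not on Y once X is given.

```python
import numpy as np

def complete_case_slope(n=200_000, beta0=1.0, beta1=2.0, seed=0):
    """Compare the OLS slope on full data with the complete-case slope.

    Assumed setup (illustrative only): Y = beta0 + beta1 * X + noise,
    and X is missing with a probability that depends on X alone.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = beta0 + beta1 * x + rng.normal(size=n)

    # MNAR in X: larger X values are more likely to be missing.
    p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))
    observed = rng.random(n) > p_missing

    # np.polyfit returns coefficients highest degree first: [slope, intercept].
    slope_full = np.polyfit(x, y, 1)[0]
    slope_cc = np.polyfit(x[observed], y[observed], 1)[0]
    return slope_full, slope_cc, observed.mean()

slope_full, slope_cc, frac_observed = complete_case_slope()
print(f"fraction of rows with X observed: {frac_observed:.2f}")
print(f"slope, full data:      {slope_full:.3f}")
print(f"slope, complete cases: {slope_cc:.3f}")
```

Under these assumptions both slopes should land close to the true value of 2, even though roughly half the X values are dropped in a way that depends on X.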

matloff.bsky.social•19 hours ago

Right, I used the term "effect assessment." The toweranNA package won't help you there.

matloff.bsky.social•19 hours ago

I think we're talking past each other. :-) My phrase "doesn't matter" wasn't meant in your context here.

BTW, what specifically do you mean by "unbiased"? I'm not aware of any method to check this.

lucystats.bsky.social•19 hours ago

😅 sorry!

By unbiased I just mean what you’d get with the full data (no missingness in X) is the same (in expectation) as what you’d get if you estimate it using partially observed X

matloff.bsky.social•18 hours ago

I see. Then how would one check that, especially in the regression case?
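With real data the full-data estimate is unavailable, so the definition above cannot be checked directly. In a simulation, where the full data are known, it can. The sketch below is an illustration under assumed conditions, not a method from the thread: the model, sample sizes, and the two missingness mechanisms (`"depends_on_x"` and `"depends_on_y"`) are hypothetical names and choices. It averages the full-data and complete-case slopes over many replications and compares them.

```python
import numpy as np

def mean_slopes(mechanism, reps=300, n=2_000, beta1=2.0, seed=1):
    """Average full-data and complete-case OLS slopes over many replications.

    mechanism: "depends_on_x" -> missingness in X driven by X itself
               "depends_on_y" -> missingness in X driven by the outcome Y
    """
    rng = np.random.default_rng(seed)
    full, cc = [], []
    for _ in range(reps):
        x = rng.normal(size=n)
        y = 1.0 + beta1 * x + rng.normal(size=n)

        if mechanism == "depends_on_x":
            driver = x
        elif mechanism == "depends_on_y":
            driver = y
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")

        # Probability that X is missing rises with the driver variable.
        p_missing = 1.0 / (1.0 + np.exp(-driver))
        observed = rng.random(n) > p_missing

        full.append(np.polyfit(x, y, 1)[0])
        cc.append(np.polyfit(x[observed], y[observed], 1)[0])
    return float(np.mean(full)), float(np.mean(cc))

full_x, cc_x = mean_slopes("depends_on_x")
full_y, cc_y = mean_slopes("depends_on_y")
print(f"missingness driven by X: full={full_x:.3f}, complete-case={cc_x:.3f}")
print(f"missingness driven by Y: full={full_y:.3f}, complete-case={cc_y:.3f}")
```

In this setup the two averages should agree closely when missingness is driven by X alone, and should differ noticeably when it is driven by Y. That only tells you about the assumed generating model; it does not verify anything about a particular real dataset.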

matloff.bsky.social•19 hours ago

AFAIK, the only missingness method specifically designed for prediction is our toweranNA package, on CRAN.

But again, the standard is to apply missingness methods to your data first, no "equation," then fit your model. My students and I are developing a package to facilitate this. 2/

matloff.bsky.social•19 hours ago

I definitely recommend multiple imputation methods, as most missingness methods have high variance. Of course, they also all have bias, but there is not much one can do about that, given unverifiable assumptions. 3/
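For readers unfamiliar with how multiple imputation combines results, the usual pooling step is Rubin's rules: average the per-imputation estimates, then add the within-imputation variance to an inflated between-imputation variance. The sketch below is an illustration only; the function name `pool_rubin` and the input numbers are hypothetical, and the thread does not say which software or pooling procedure the author has in mind.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates of one scalar parameter (Rubin's rules).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2 or u.size != m:
        raise ValueError("need matching inputs from at least two imputations")

    pooled = q.mean()          # pooled point estimate
    within = u.mean()          # average within-imputation variance
    between = q.var(ddof=1)    # variance of the estimates across imputations
    total = within + (1.0 + 1.0 / m) * between
    return float(pooled), float(total)

# Hypothetical slope estimates and variances from five imputed datasets.
pooled, total_var = pool_rubin(
    estimates=[1.9, 2.1, 2.0, 2.2, 1.8],
    variances=[0.04, 0.04, 0.04, 0.04, 0.04],
)
print(f"pooled estimate: {pooled:.3f}, pooled SE: {total_var ** 0.5:.3f}")
```

The between-imputation term is what carries the extra uncertainty due to the missing values; a single imputation has no way to estimate it.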

matloff.bsky.social•19 hours ago

If you are really doing prediction, as opposed to effect assessment, I do suggest toweranNA. As I said, it's specifically for prediction, with the bonus that its assumptions are verifiable. 4/4
