Looking for a longitudinal (ideally intensive) dataset where missing data is expected to be problematic, i.e. MNAR. Anyone have pointers to good examples?
Comments
Not intensive, but could think about publicly available panel data such as Pairfam (14 annual waves)? There’s missing data for all the typical reasons/attrition over time but also because not all constructs were assessed at every wave (some every 2 years, for example).
Thanks - I'm vaguely hoping for something like a depression / mood score, where people skip that time point because they are lower in mood, or a 'rate your boss's performance' score which they skip (if they feel negatively) for fear of reprisal, etc.
I think there could be something like that in there, but not sure. Also some people who skipped a whole wave and rejoined later, but more common for people to drop out entirely.
Both larger t and smaller n, but two ESM datasets came to mind that could be relevant: One on alcohol consumption (https://osf.io/q7upd/) or one on PTSD symptoms (https://osf.io/6vxhf/)
It’s intensive longitudinal data where the outcome is a binary self-report question on binge eating. There’s 50% missingness and a suspected MNAR process where people don’t respond to the binge eating question when they binge eat
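To make that mechanism concrete, here is a rough simulation sketch, not the actual dataset or model. Every number in it is an assumption picked for illustration: a 30% true binge rate, an 80% chance of skipping the question after a binge, and a skip rate for non-binge occasions chosen so overall missingness comes out at about 50%. It also ignores the time-series dependence real ESM data would have.

```python
import random

def simulate_mnar(n_obs=100_000, p_binge=0.30, p_miss_if_binge=0.80,
                  p_miss_if_not=0.26 / 0.70, seed=1):
    """Simulate a binary outcome whose missingness depends on its own value.

    All probabilities are hypothetical. p_miss_if_not is set so that
    0.30 * 0.80 + 0.70 * p_miss_if_not = 0.50 overall missingness.
    """
    rng = random.Random(seed)
    y_true, y_obs = [], []
    for _ in range(n_obs):
        y = 1 if rng.random() < p_binge else 0
        # MNAR: the chance of skipping the question depends on the answer itself.
        p_miss = p_miss_if_binge if y == 1 else p_miss_if_not
        missing = rng.random() < p_miss
        y_true.append(y)
        y_obs.append(None if missing else y)
    return y_true, y_obs

def summarize(y_true, y_obs):
    """Compare the true rate with the rate among answered prompts only."""
    observed = [y for y in y_obs if y is not None]
    return {
        "true_rate": sum(y_true) / len(y_true),
        "observed_rate": sum(observed) / len(observed),
        "missing_share": 1 - len(observed) / len(y_obs),
    }

result = summarize(*simulate_mnar())
print(result)
```

Under these made-up settings, the rate among answered prompts should land near 0.30 * 0.20 / 0.50 = 0.12, well below the assumed true rate of 0.30, which is the kind of bias a complete-case analysis would carry.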
Thanks I totally forgot you'd already done this idea, has been at the back of my mind for too long aha. Maybe I'll just replicate it in the non-Bayesian continuous time context, as I really just want it as a nice example of the usefulness of binary data in dynamic systems...
Yes, I think you'd have to use Bayesian methods in Mplus. I also don't think you could do a continuous time version in Mplus because I don't think that they've added support for binary variables in continuous time (although I might be behind on what is supported!)
I’ve been noodling on how to deconfound that but it’s tightly wound w the process.
(That I can't share due to client privileges, but I just wanted someone to be sad with me...)