drewhalbailey.bsky.social
education, developmental psychology, research methods at UC Irvine
74 posts 1,474 followers 271 following
Regular Contributor
Active Commenter
comment in response to post
Although field-specific authorship norms probably mostly just reflect the values of people in the field, I also think they can affect those values too. This seems like a good example! (I have some guesses about unintended consequences of tiny authorship teams too, btw.)
comment in response to post
6) LCGAs never replicate across datasets or in the same dataset. They usually just produce the salsa pattern (hi/med/low) or the cat's cradle (hi/low/increasing/decreasing). This has misled entire fields (see all of George Bonanno's work on resilience, for example). psycnet.apa.org/fulltext/201...
comment in response to post
But I really hope we get 10 more years of strong studies now on the effects of large increases in access on outcomes for "always takers" and especially for elite students. There are lots of good reasons to expect these effects should differ. (2/2)
comment in response to post
I investigated how often papers' significant (p < .05) results rest on fragile (.01 ≤ p < .05) p-values. An excess of such p-values suggests low odds of replicability. From 2004-2024, the rates of fragile p-values have gone down precipitously across every psychology discipline (!)
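A rough sketch of that kind of check, assuming a data frame of reported p-values (the column name and values are made up; this is not the actual coding pipeline):

```python
import pandas as pd

# Hypothetical: one row per reported test, with its p-value.
results = pd.DataFrame({"p_value": [0.003, 0.021, 0.048, 0.30, 0.012, 0.0004]})

significant = results[results["p_value"] < 0.05]
fragile = significant[significant["p_value"] >= 0.01]

# Share of significant results that are "fragile" (.01 <= p < .05).
print(f"Fragile share of significant results: {len(fragile) / len(significant):.2f}")
```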
comment in response to post
Hope to see at least one of these in each APS policy brief from now on!
comment in response to post
Really like it!
comment in response to post
(Not saying the public is necessarily right; you can get programs that pass a cost-benefit test with much smaller effects on test scores than laypeople want. But it is a problem for policymakers that the public wants policy to deliver unrealistically sized effects.)
comment in response to post
If you ask people what kinds of effects they’d need to see to decide to implement something new, the effects they name are much bigger than realistically sized effects in ed policy. We’ve decided collectively to pretend this isn’t a problem and then get surprised at the backlash when it comes.
comment in response to post
Starting to feel like "don't look at the coefficients, just calculate whatever metric is relevant to your research question" is a highly underappreciated stats hack and also I may have to get myself a marginaleffects T-shirt.
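A minimal illustration of the idea with toy data and plain statsmodels (the marginaleffects packages automate computations like this); everything below is made up for the sake of the sketch:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy data: a binary outcome, one predictor of interest (x), one covariate (z).
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=500), "z": rng.normal(size=500)})
p_true = 1 / (1 + np.exp(-(0.8 * df.x + 0.5 * df.z)))
df["y"] = (rng.random(500) < p_true.to_numpy()).astype(int)

model = smf.logit("y ~ x + z", data=df).fit(disp=False)

# Instead of interpreting the log-odds coefficient on x, compute the quantity
# the research question actually asks about: the average change in predicted
# probability when everyone's x is shifted up by one unit.
p0 = model.predict(df)
p1 = model.predict(df.assign(x=df.x + 1))
print("Average effect of +1 in x on Pr(y = 1):", (p1 - p0).mean())
```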
comment in response to post
And you can think of the RI-CLPM as doing something like this too, using repeated measures of the same x over time.
comment in response to post
Not eloquently. But in the appendix of this paper, we show that a "multivariate intercept" model that does this (constraining all loadings to equality) reproduces patterns of causal impacts of some RCTs better than OLS (see Table S4 + Fig S1): pmc.ncbi.nlm.nih.gov/articles/PMC...
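Not the model from the appendix, but a rough numerical analogue of the equal-loadings idea: with standardized baseline measures, equal loadings behave much like a unit-weighted composite entered as a single control, which can be compared against OLS with the measures entered freely (all data and effect sizes below are invented):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: outcome y, treatment t, three noisy measures of one skill.
rng = np.random.default_rng(1)
n = 1000
skill = rng.normal(size=n)
df = pd.DataFrame({
    "t": rng.integers(0, 2, n),
    "x1": skill + rng.normal(size=n),
    "x2": skill + rng.normal(size=n),
    "x3": skill + rng.normal(size=n),
})
df["y"] = 0.3 * df.t + skill + rng.normal(size=n)

# OLS: each baseline measure gets its own freely estimated coefficient.
ols = smf.ols("y ~ t + x1 + x2 + x3", data=df).fit()

# Rough analogue of equality-constrained loadings: a unit-weighted composite
# of the standardized measures, entered as a single control.
z = df[["x1", "x2", "x3"]].apply(lambda c: (c - c.mean()) / c.std())
df["composite"] = z.mean(axis=1)
constrained = smf.ols("y ~ t + composite", data=df).fit()

print("Treatment effect, free coefficients (OLS):", round(ols.params["t"], 3))
print("Treatment effect, equal-weight composite:", round(constrained.params["t"], 3))
```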
comment in response to post
Do one for when people realize the extracted factor might be more useful as a *control* for estimating the effects of interest than as the key predictor of interest.
comment in response to post
You like good music and are in North Carolina: are you into Wednesday?
comment in response to post
The Paul Meehl Graduate School! Very cool.
comment in response to post
Ah got it, thanks. In this case, I guess I agree the link between theory and these statistics is often squishy!
comment in response to post
I think that's the way some people talk about types of validity and reliability. But I view (threats to) validity typologies as compatible with estimands: threats to validity are ways that mapping between estimates and estimands can go wrong!
comment in response to post
Thanks, Jason!
comment in response to post
I agree dosage is a big deal here. I encourage anyone with relevant non-experimental data to regress mother and child outcomes on increments of $4,000 annual income. You might be surprised; I was!
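Something like the template below (the file and column names are placeholders, and a real analysis would add controls and worry hard about confounding):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical observational data with annual family income and a child outcome.
df = pd.read_csv("family_panel.csv")  # assumed columns: child_outcome, income

# Rescale income so the coefficient reads as the association per $4,000/year.
df["income_4k"] = df["income"] / 4000

fit = smf.ols("child_outcome ~ income_4k", data=df).fit()
print(fit.params["income_4k"])  # outcome change associated with +$4,000 of annual income
```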
comment in response to post
I think @protzko.bsky.social might have done this.
comment in response to post
I agree this is a main goal of prereg, but I think it follows that prereg incentivizes researchers to think more about the severity of their tests on the front end! Thus, I think these ideas are pretty reconcilable.
comment in response to post
Definitely not a blog post, but I think Shadish, Cook, and Campbell’s section on threats to internal validity is a nice readable summary for an educated non-expert audience. Bet you could make it shorter and funnier though!
comment in response to post
I'm all for more work like this. My colleague, Jade Jenkins, has a similar, transparently reported analysis not finding much. www.tandfonline.com/doi/abs/10.1... Also, for those interested, we reviewed the lit on these kinds of analyses a few years ago here: www.sciencedirect.com/science/arti...
comment in response to post
3) Again, pre-k impacts aren't reported in Table 2, but usually we see most absolute fadeout in the first year post-treatment. Weird that the interaction (which you'd expect would be largest in the years with most fadeout) grows a lot from K to the later elementary years.
comment in response to post
2) Impacts in grade 5 from this study are negatively signed (Fig 1). But Table 2 doesn't report impacts for the reference group (no preschool peers), which must be negative by grade 5. When the effects of hypothesized "quality" inputs are negative for one group, I never know what to make of them.
comment in response to post
1) The only statistically sig interaction coefficient in Table 2 is implausibly large. 10 preschool classmates can't possibly buy you 1 SD of cognitive skills. Combine this with researcher degrees of freedom in binning the years together, and I'm worried it may be a false positive.
comment in response to post
Thanks, Eric! I think the design is great and commend them for reading outside the econ literature. Findings could be real, but I am somewhat skeptical for 3 reasons:
comment in response to post
Same age, also Bulls fan. Remember Ben Gordon?
comment in response to post
Ha! There are some really interesting implications of an omnicausal world, I think (e.g., www.stat.columbia.edu/~gelman/rese...), but I guess psych talks don’t usually deal with them.