see Peter’s reply - still doesn’t get around the problem that it’s scale/model dependent and generally you need to make stronger assumptions one way or another
Comments
Maybe I’m missing something but isn’t this all worded way too generally? A simple gender difference in an experimental effect would be a moderating variable, right?
This would be something that is plausibly causally identified, but it would still be scale dependent. If you look into the psych literature, there’s very early work on how interactions that don’t cross-over are “removable” (in the context of experiments). It’s not necessarily a “problem” tho
Also note that "removable" here is a mathematical term: an interaction is removable if we can nullify it with a monotonic transformation. But just because an interaction is "removable" doesn't mean it isn't substantial!
Of course being able to recognize when an interaction is substantial even though it's removable requires a skill psychologists fear: understanding the scale of their measures!
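To make "removable" concrete, here is a minimal sketch with made-up cell means for a 2x2 design (the numbers and the helper name `interaction_contrast` are purely illustrative, not from any study). The first table has a non-crossing interaction that disappears after a log transform; the second crosses over, so no monotonic transform can get rid of it.

```python
import math

# Hypothetical cell means, keyed by (factor A level, factor B level).
# Non-crossing pattern: effects multiply on the raw scale.
non_crossing = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 4.0}
# Cross-over pattern: the effect of B flips sign depending on A.
cross_over = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 1.0}


def interaction_contrast(cells, transform=lambda v: v):
    """Difference in the effect of B between A=1 and A=0, on the transformed scale."""
    t = {key: transform(value) for key, value in cells.items()}
    return (t[(1, 1)] - t[(1, 0)]) - (t[(0, 1)] - t[(0, 0)])


raw_nc = interaction_contrast(non_crossing)            # interaction on the raw scale
log_nc = interaction_contrast(non_crossing, math.log)  # interaction on the log scale
raw_co = interaction_contrast(cross_over)
log_co = interaction_contrast(cross_over, math.log)

print("non-crossing:", raw_nc, round(log_nc, 6))
print("cross-over:  ", raw_co, round(log_co, 6))
```

The log is just one convenient monotonic transform for this toy table; which transform removes a given interaction (if any) depends on the data.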
Yeah exactly. I see removability as an issue precisely when you’re unwilling to commit to a scale, because a scale isn’t something you can justify by running some statistical test…
Omg, I recently had a chat with an LLM about this and out of nowhere it finished with a quote (and more) from Alfred Korzybski: "The map is not the territory — the scale you choose shapes what you see"
Actually my favorite scale dependent interaction from my own work is precisely the scenario you describe! Randomized treatment: being seated next to each other; outcome: whether students befriend each other; moderator: gender-combination of the two students.
The interaction coef in the probit actually isn’t significant. If you look at the friendship probability, effect is much bigger for gender-matched dyads (which makes a lot of sense). If one did relative risks instead, effect would be *bigger* for mismatched dyads! https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0255097
In brms, no less. This was a bit improvised because the other two had preregistered using a package that couldn’t do nested models, and also they wanted average marginal effects about which I had never heard before, so I made them “from scratch” using the posterior draws. Good times actually
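For anyone curious what "from scratch using the posterior draws" can look like, here is a rough sketch in plain Python. It is not the code from the paper and does not use brms: the covariate `x`, the simulated `draws`, and the helper `ame_for_draw` are all hypothetical stand-ins. The idea is simply to compute, for each posterior draw of the probit coefficients, the average difference in predicted probability between treated and untreated, and then summarize across draws.

```python
import random
from statistics import NormalDist, mean

std_normal = NormalDist()
rng = random.Random(1)

# Hypothetical covariate values for 200 units (stand-in for real data).
x = [rng.gauss(0, 1) for _ in range(200)]

# Hypothetical posterior draws of (intercept, treatment, covariate) probit
# coefficients; in practice these would come from the fitted model.
draws = [
    (rng.gauss(-1.0, 0.1), rng.gauss(0.3, 0.1), rng.gauss(0.5, 0.1))
    for _ in range(500)
]


def ame_for_draw(b0, b_treat, b_x):
    """Average over units of P(y=1 | treated) - P(y=1 | untreated) for one draw."""
    diffs = [
        std_normal.cdf(b0 + b_treat + b_x * xi) - std_normal.cdf(b0 + b_x * xi)
        for xi in x
    ]
    return mean(diffs)


# One average marginal effect per posterior draw, sorted for quantiles.
ame_draws = sorted(ame_for_draw(*d) for d in draws)
posterior_mean = mean(ame_draws)
lower = ame_draws[int(0.025 * len(ame_draws))]
upper = ame_draws[int(0.975 * len(ame_draws)) - 1]

print(round(posterior_mean, 3), round(lower, 3), round(upper, 3))
```

Because the averaging happens inside each draw, the spread of `ame_draws` carries the posterior uncertainty through to the probability scale.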
Neither of these answers is wrong; one needs to know which scale one cares about. Psych would prioritize the coefficient (so the latent underlying scale assumed by the model), for my co-authors from sociology looking at probabilities was a no-brainer. We all considered relative risk way out there lol
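A toy version of how the three scales can disagree, with invented numbers (not the estimates from the paper): both groups get the same treatment shift on the probit scale, but they start from different baselines. The names `baseline` and `treatment_effect` are assumptions for illustration only.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Hypothetical baseline values on the probit (latent) scale.
baseline = {"matched": -0.84, "mismatched": -2.05}
# Same shift for both groups, i.e. no interaction on the probit scale.
treatment_effect = 0.30

results = {}
for group, z0 in baseline.items():
    p0 = std_normal.cdf(z0)                     # friendship probability, control
    p1 = std_normal.cdf(z0 + treatment_effect)  # friendship probability, treated
    results[group] = {
        "probit_difference": std_normal.inv_cdf(p1) - std_normal.inv_cdf(p0),
        "risk_difference": p1 - p0,
        "risk_ratio": p1 / p0,
    }

for group, r in results.items():
    print(group, {name: round(value, 3) for name, value in r.items()})
```

In this sketch the probit differences are identical, the risk difference is larger for the matched group, and the risk ratio is larger for the mismatched group, so "who benefits more" depends entirely on the scale you pick.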
In my opinion, if you observe effect measure heterogeneity by some third variable (e.g. Sex), then in practice you have simply shown that - on that scale - there are some signs of additional non-linearities among some terms in your model that are being picked up by an interaction term.
yep, I think the difficulties become clear if you think of linearity and additivity being simplifying assumptions that we make in order to be able to feasibly specify and estimate high-dimensional* models. if you really wanna treat linear models as privileged, well, good luck!
I’d think that probably depends on your goals/level of ambition. But simplifying assumptions will usually (always) be needed to some degree. In a lot of applied contexts I think linearity assumptions are fair.
"fair" in the sense of "the quality & quantity of data doesn't justify anything further"? absolutely. but then be careful interpreting departure from linearity/additivity
(I'm saying this as someone who's done this a bunch in the past and helped others do the same... now ended up jaded and cynical)
@ruben.the100.ci will be delighted to hear that you liked it, in particular after I deleted the Genghis Khan quote on our other paper on what makes people happy.
Is scale dependence a problem or just a reality? Charlie Poole likes to use terms like risk difference modifier or additive scale modifier or risk ratio modifier making it clear which type of modifier you’re looking for.
I think the scale dependence thing is more that people’s theorizing is often so imprecise (“this will interact”) that an interaction on any scale would count, and then you could always just transform the data until you find said interaction, leaving it unclear what precisely has been learned…
now you’re opening up the “real science needs Theory behind it” can of worms, and while I don’t entirely disagree, so much of medical (and social science) research is closer to “throw science at the wall and see what sticks, then explain it later” in practice
Yeah I agree. That’s what happens in practice. Although in epi it’s more just that people look on the ratio scale when they have binary outcomes and the additive scale when they have continuous outcomes. But I’m just trying to convey the point that the problem is the way it’s taught as though…
…modification is a thing that exists independent of scale. The Charlie Poole way (which I’ve started teaching in the past few years) is to not use the term effect measure modifier or moderator but instead just always say what is being modified (risk difference, ratio etc).
I’m being a bit picky with language here but rather than scale dependence being the problem, it’s the thinking that things like moderators or modifiers, without reference to scale, exist.
So I was thinking the same thing re scale, and I think there is a possibility to strengthen your argument by hypothesizing the correct scale of effect modification, especially if you ground it in theory a priori with a plausible mechanism
This part I am less sure about: predicting which scale should be impacted. I feel like multiplicative is about compounding risk at an individual level, whereas additive is about excess risk generated at the population level. Even though I think this vibes-based idea is not correct, since both scales are measured at the population level
This is indeed interesting! There are some known (simple) data generating mechanisms under which the causal risk ratio is stable while the causal risk difference (approximated by the survival ratio) is not - and vice versa: https://arxiv.org/pdf/2106.06316 or https://doi.org/10.1093/aje/kwaf086.
I admire people who are able to provide a theoretical justification for one scale vs another!
I see the choice of scale for a statistical model as a mix of convention and convenience, and I find it hard to convince myself that this necessarily bears any relationship to the underlying mechanism.
I think you can model the outcome on any scale but if you’re on the scale that most or a majority of covariates are operating on then you’ll have fewer non-linear terms.
Humans rarely check for non-linearities though so I’d say the choice of scale matters.
* dimensionality > 1
On the other hand I don’t have a good grasp on the depth of this problem (other than that you’re in luck if you have a proper cross-over interaction).
https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.2639
I shall check it out.
It’ll be interesting to see whether more work is done in the future on how to think mechanistically about choice of scale.