6 Comments
ScienceGrump:

Great deep dive into how bad science operates in practice. Unfortunately, this kind of thing is more common than not for claims about lifestyle and health, but here is an especially egregious case of a narrative overpowering the data. The only finding worth following up is... that vegetarians have 3x greater risk of esophageal cancers than meat eaters.

ScienceGrump:

On a technical note, I will say that requiring FDR < .05 is usually not appropriate in the way that requiring small p-values is. If the FDR is well-calibrated, it means you're guaranteeing that 95% of results can't be explained by the null - a *very* stringent standard of confidence. If it's not calibrated, no threshold can be considered safe.

Adam Rochussen:

It's a fair point. Though the B-H FDR and the unadjusted p-value are directly related: a small p-value produces a small FDR.

FDR = p-value * total tests / p-value rank

(and then make sure the sequence of p_adj is non-decreasing)
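That two-step recipe can be sketched in a few lines of Python (a minimal plain Benjamini-Hochberg implementation on a list of p-values; this is my illustration, not the study's code):

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values: p * m / rank, then
    enforce monotonicity with a running minimum taken from the
    largest p-value downward, capping at 1."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk ranks m, m-1, ..., 1
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adj[i] = min(running_min, 1.0)
    return adj

# The three small p-values all inherit the least favourable of
# their raw adjustments; the large one is left alone.
print(bh_adjust([0.01, 0.02, 0.03, 0.5]))
```

The running minimum is what "make sure the sequence is non-decreasing" means in practice: a result can never end up with a smaller adjusted value than a result with a smaller raw p-value.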

And what that actually means is that, of the three results in this study with FDR < 0.05, we'd expect at most 5% to be false positives. You're right that it can be seen as stringent, but I guess that's a subjective take. It's certainly less stringent than other multiple-comparisons adjustments (e.g. Bonferroni). For this study, where many of the associations don't really have much mechanistic rationale, they're basically hypothesis-fishing to justify follow-up research. For exploratory research like that, I think a pretty stringent FDR cutoff is very necessary.

I'm not entirely sure what you mean by the calibration of the FDR? But certainly the adjustment depends on assumptions such as test independence holding (which may not be the case for multiple cancers in different parts of the body). The sensitivity analysis can therefore be a better robustness check, but that too filtered out the big headline results lol.

ScienceGrump:

What I mean by miscalibration is really a misspecified null: you have some mathematical function that doesn't match the null hypothesis you claim to be testing. This can happen in lots of ways. Correlation is a very big one. But you might also assume that errors are normal when they are kurtotic. In any case where authors try to "correct" for covariates, there are specific assumptions about the ways covariates can influence the variable of interest. Most often, it's just failing to account for a source of error altogether.
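One way to see the normality point concretely: simulate a true null whose errors violate the normality assumption and watch the t-test's rejection rate drift from its nominal level (a toy simulation of my own, not anything from the post; 2.262 is the standard two-sided 5% critical value for df = 9):

```python
import math
import random

def t_stat(xs):
    """One-sample t statistic for H0: population mean = 0."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m / math.sqrt(var / n)

def rejection_rate(sampler, n=10, reps=5000, t_crit=2.262, seed=0):
    """Fraction of simulated null datasets where |t| exceeds the
    two-sided 5% critical value for df = n - 1 = 9."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        if abs(t_stat([sampler(rng) for _ in range(n)])) > t_crit:
            hits += 1
    return hits / reps

# The null is true in both cases (population mean is 0), but only
# the first matches the normality assumption behind the t reference.
rate_normal = rejection_rate(lambda r: r.gauss(0, 1))
# Lognormal errors recentred at zero: skewed and heavy-tailed.
rate_skewed = rejection_rate(lambda r: math.exp(r.gauss(0, 1)) - math.exp(0.5))
print(round(rate_normal, 3), round(rate_skewed, 3))
```

Under the normal errors the empirical rate sits at the nominal 5%; under the skewed errors it typically drifts away from it, even though the null is still true - the "p < .05" claim is then about a null nobody is actually testing.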

I agree it is a big problem if the correction is run over a bunch of ad hoc hypotheses. BH only works if the hypotheses are systematic and unbiased. But in that case I don't trust an FDR < .05 either.

Adam Rochussen:

Gotcha. Makes sense, thanks!

Adam Rochussen:

Thanks! Agreed. It's super common in fields where proper causal tools are limited (nutrition, climate science, sociology, etc.).