Fruit of Eden

On a technical note, I will say that requiring an FDR < .05 is usually not appropriate in the way requiring small p-values is. If the FDR is well-calibrated, that means you're guaranteeing 95% of results can't be explained by the null - a *very* stringent standard of confidence. If it's not calibrated, no threshold can be considered safe.

It's a fair point. Though the B-H FDR and the unadjusted p-value are directly related, so a small p-value will create a small FDR.

FDR = p-value * total tests / p-value rank

(and then make sure the sequence of p_adj is non-decreasing)
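To make the formula and the non-decreasing step concrete, here is a minimal sketch in plain Python. The function name `bh_adjust` and the example p-values are made up for illustration; they are not from the study. The adjusted values are also capped at 1, which the formula above leaves implicit.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest p-value down to the smallest, so each adjusted
    # value is capped by those ranked above it. This is the step that keeps
    # the adjusted sequence non-decreasing in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        raw = p_values[i] * m / rank  # p-value * total tests / rank
        running_min = min(running_min, raw)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, purely for illustration.
example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
example_adj = bh_adjust(example_p)
print(example_adj)
```

In this made-up example the third p-value's raw adjustment (0.039 * 5 / 3 = 0.065) is larger than the fourth's (0.041 * 5 / 4 = 0.05125), so the monotonicity step pulls it down to 0.05125.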

And then what that actually means is that, among results passing FDR<0.05 (three in this study), we'd expect roughly 5% to be false positives. You're right that it can be seen as stringent, but I guess that's a subjective take. It's certainly less stringent than other multiple comparisons adjustments (e.g. Bonferroni). For this study, where lots of the associations don't really have much mechanistic rationale, they're basically hypothesis-fishing to then justify follow-up research. For exploratory research like that, I think a pretty stringent FDR cutoff is very necessary.

I'm not entirely sure what you mean by the calibration of the FDR? But certainly this adjustment all depends on assumptions of test independence being true etc (which may not be the case for multiple cancers in different parts of the body). The sensitivity analysis can therefore be a better robustness check, but that too filtered out the big headline results lol.

ScienceGrump

What I mean by miscalibration is really a misspecified null: you have some mathematical function that doesn’t match the null hypothesis you claim to be testing. This can happen in lots of ways. Correlation is a very big one. But you might also assume that errors are normal when they are kurtotic. In any case where authors try to “correct” for cofactors, there are specific assumptions about the ways covariates can influence the variable of interest. Most often, it's just failing to account for a source of error altogether.

I agree it is a big problem if the correction is on a bunch of ad hoc hypotheses. BH only works if the hypotheses are systematic and unbiased. But in that case I don't trust an FDR < .05 either.

Gotcha. Makes sense, thanks!

Thanks! Agreed. It's super common in fields where proper causal tools are limited (nutrition, climate science, sociology etc).

☔Jason Murphy

Mar 18

I may be weird but I get my Science news by scrolling PubMed, sorting by trending.

So I read the original paper and was not that surprised to see vegetarians getting more cancer. I was vego myself for a number of years and it leads to eating a lot of processed food!

One side-effect of scrolling PubMed is you get a strong sense of how limited science is. Lots of tiny mechanistic studies in vitro. A handful of randomised clinical trials - far fewer than you'd expect - mostly delivering very marginal benefits. Loads and loads of reviews. Most work being done on cancer (that's where the money is). And most published papers simply not being worth even a glance.

Mar 19

I don’t think I’ve ever heard of anyone doing this 😂

Makes sense. You’ll get a good cross-section. You’re right about the absence of good science. Even many RCTs are really badly designed to the point that they’re basically worthless. Nutrition science seems to be particularly stained by this.

watchdominion.com

Mar 18

Re: Except there is nothing causal to be inferred here at all. Zero. These are purely observational data. Correlation does not imply causation.

When you say "nothing causal to be inferred", do you account for different credences? For example, prospective cohort studies (even in nutrition) agree strongly with RCTs, so I can make a causal inference, just a more limited one than I would if I were reading an RCT.

Also, how do you know confounding variables are confounding, considering that we only consider them confounding based on observational data? If we claim something is confounding, we are making a causal claim. Do we have e.g. RCTs on these variables?

Mar 18

Perhaps “nothing causal to be inferred here” is too strong. Certainly nothing causal is proven because it wasn’t directly tested. I guess anyone is welcome to make inferences from observational data. And some aspects of observational data can make such inferences slightly more likely to be accurate (eg dose-response relationships).

I disagree that identifying a confounder equals “making a causal claim”. A correlation can be confounded by a third variable. Perhaps this is just a semantic disagreement?

And we don’t need RCTs on third variables to know if it is a confounder. Simply adjusting for the covariate reveals if it confounded the original correlation or not.
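As a rough illustration of what "adjusting" does, here is a toy simulation in plain Python. Everything in it is an assumption made up for the example: the variable names (smoker, drinker, outcome), the probabilities, and the choice to adjust by stratifying rather than by regression. The data are built so that the outcome depends only on smoking, and drinking is merely associated with smoking.

```python
import random


def risk(rows):
    """Share of rows in which the outcome occurred."""
    return sum(r["outcome"] for r in rows) / len(rows)


def risk_difference(rows):
    """Outcome risk among drinkers minus outcome risk among non-drinkers."""
    drinkers = [r for r in rows if r["drinker"]]
    others = [r for r in rows if not r["drinker"]]
    return risk(drinkers) - risk(others)


def simulate(n=200_000, seed=0):
    """Generate toy data where only smoking affects the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        smoker = rng.random() < 0.3
        # Smokers are more likely to drink (an association, nothing more).
        drinker = rng.random() < (0.7 if smoker else 0.3)
        # The outcome depends on smoking alone; drinking has no effect here.
        outcome = rng.random() < (0.30 if smoker else 0.10)
        rows.append({"smoker": smoker, "drinker": drinker, "outcome": outcome})
    return rows


rows = simulate()
crude = risk_difference(rows)
among_smokers = risk_difference([r for r in rows if r["smoker"]])
among_nonsmokers = risk_difference([r for r in rows if not r["smoker"]])

print(f"crude difference:          {crude:.3f}")
print(f"difference in smokers:     {among_smokers:.3f}")
print(f"difference in non-smokers: {among_nonsmokers:.3f}")
```

In this setup the crude comparison shows drinkers with a clearly higher risk, while the difference within each smoking stratum sits close to zero. That only works so neatly because the simulation was built with a known structure. With real data, a change in the estimate after adjustment is consistent with confounding, but it doesn't on its own say why the covariate matters.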

watchdominion.com

Mar 18 (Edited)

I think what it means for e.g. smoking to confound observational data on the healthiness of drinking alcohol is that smoking is associated with drinking alcohol and lies in the causal chain for some of the outcomes observed in the alcohol group. If it didn't, then it would not confound the data.

I understand we can adjust for a covariate to reveal its confounding, but why did we adjust for that particular covariate to begin with? Why would we adjust for smoking? If it's simply associated with drinking, that doesn't mean it can cause any of the outcomes we measured, so it doesn't mean it can confound. Are we just adjusting because it is a covariate? I can imagine a lot of spurious covariations. Why not adjust for sunscreen usage, or tattoo region, or ice cream preference, if those covary? Is there a p-hacking equivalent for adjusting for everything that covaries until we find a confounder?

Or maybe there is a principled way of sorting spurious covariation from plausibly confounding covariation, like all of our observational data on smoking.

Mar 19

Yeah I think adjusting for things that don’t need to be adjusted for, or not adjusting for things that should be adjusted for, basically amounts to p-hacking. It’s true you need a hypothesis of a causal chain to decide what to adjust for etc. But in terms of proving causality, observational studies can never achieve that.