11 Jun Causation and Mechanism
Over the last couple of days, Paige Harden ran a poll about causation in the genome.
There was a lot of very interesting discussion that you should have a look at. My take was to wonder whether “cause” is the most important thing we are trying to determine here:
Some discussion ensued, which you can look at. For the uninitiated, “red-haired kids” is code for an example in which there is a society that abuses ginger kids, who then have low IQs. In that society genes for red hair, it seems to me uncontroversially, “cause” low IQ, but I don’t think that kind of cause is what anyone is looking for here.
I should say that I hate to disagree with Paige, in part because she was my student, but also because I have enormous respect for her and her opinion. She is obviously well on her way to being a world authority on the science and philosophy of complex human genetics. She is also a fierce scientific debater, and one crosses her at some peril. But on the other hand, I spent my years at UT Austin in continuous debate with my great, equally fierce (and, sadly, long gone) mentor Lee Willerman. So all in all, it’s a good thing, and the truth is I don’t think our disagreement runs very deep.
Anyway, as I often suggest, it is difficult to evaluate a polygenic score, or any other genetic effect, without some notion of “mechanism”. Mechanism, like cause, is a concept for which one could descend a deep philosophical rabbit hole, but I will avoid that. All I mean is that it is important to know how you get from gene to behavior.
In some ways it seems like an odd thing for me to insist on. I am no biologist, no bench scientist, and it is safe to say I have never elucidated a mechanism in my life. Many people who disagree with me react as though I am serving on some sort of official committee tasked with coming up with standards for when we should take genetic effects seriously, and I am insisting on way too stringent a test, but that’s not the point. There are just domains in which we can’t evaluate a genetic effect unless we have some idea about how the damn thing works.
Let’s start with one where it doesn’t matter, which is if you are using a score inside a larger social scientific design to fill out an understanding of a complex social process. Say you are instituting a new curriculum in classrooms, and you want to control for the counterfactual tendency to perform well in school. There is a lot of work of this kind going on nowadays, and although, like most research, its investigators sometimes oversell it in their enthusiasm, fine. I have done tons of work of this kind myself, with twins rather than PGS.
I think there are two broad categories where mechanism matters. One is if you are making strong scientific claims about the effects of the polygenic score, if, say, like Robert Plomin in Blueprint, you are claiming that polygenic scores show that DNA “makes us who we are”. The second is if you are using the scores to make decisions about individual people in the real world.
Think about correlation and causation for a moment. Allele X and IQ score Y are correlated .01. What does that mean? The first possibility is that the correlation is completely spurious, just a consequence of random sampling error. This determination (and only this determination) is what significance testing is for. But if the correlation survives some valid NHST process, there has to be some kind of causal process that produced it. It didn’t come out of the ozone. It might be that the allele codes for proteins that help the brain build better “dopamine sprayers” (as Jim Flynn says) that make more efficient brains and better thinkers. It might be some kind of red-hair process (and this is not an unrealistic philosophical example: try skin color), where X causes Y, but only under very particular environmental conditions that are a more natural target of causal intervention. It might be that X and Y are both caused by some third-variable confound, like culture, that is, population stratification. It might be that X causes Y, sometimes, at the end of an impossibly long causal chain, like the recommendation letter that causes a kid to win the Nobel Prize thirty years later.
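To make the third-variable case concrete, here is a toy simulation of population stratification. It is only a sketch under made-up assumptions: the two subpopulations, their allele frequencies (0.2 and 0.6), and the half-SD difference in mean outcome are all hypothetical numbers chosen for illustration, not estimates of anything. By construction the allele has no effect on the outcome at all, yet the pooled sample still shows a correlation.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def simulate(n_per_group=20000, seed=0):
    """Two hypothetical subpopulations; the allele never touches the outcome."""
    rng = random.Random(seed)
    # (allele frequency, mean outcome): both values are illustrative assumptions.
    groups = {"A": (0.2, 0.0), "B": (0.6, 0.5)}
    data = {}
    for name, (freq, mean_y) in groups.items():
        # Allele count (0, 1, or 2) from two independent draws per person.
        xs = [(rng.random() < freq) + (rng.random() < freq)
              for _ in range(n_per_group)]
        # Outcome depends only on group membership plus noise, not on xs.
        ys = [rng.gauss(mean_y, 1.0) for _ in range(n_per_group)]
        data[name] = (xs, ys)
    return data


data = simulate()
pooled_x = data["A"][0] + data["B"][0]
pooled_y = data["A"][1] + data["B"][1]

r_pooled = pearson(pooled_x, pooled_y)
r_within = {g: pearson(xs, ys) for g, (xs, ys) in data.items()}

print("pooled correlation:", round(r_pooled, 3))
for g, r in r_within.items():
    print("within group", g, ":", round(r, 3))
```

The pooled correlation is clearly nonzero and would sail through any significance test at this sample size, while the correlation inside each group hovers around zero. Nothing in the pooled number tells you which of the stories above produced it.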
There are perfectly good reasons to suspect that ALL of these things are going on under the hood of any polygenic score. I have no reason to insist that the dopamine-sprayer part is zero. I think we probably are all born with genetic characteristics that constrain our eventual outcomes in real ways. But I don’t see how we can evaluate those relations in the complete absence of knowledge about what is going on under the hood. That makes it very very dangerous to make too-strong scientific or practical claims about what polygenic scores can do.
Try it this way. Let’s say your SAT scores are half a standard deviation higher than mine. Your EA is also half a SD higher, and your parents made half a SD more money than mine. Your PGS for SES is also half a SD higher than mine. All of these predictors are correlated with each other in the population. Question: why are you smarter than me? The answer is, we have no freaking idea, unless we are happy with old-fashioned platitudes about genes and environment working together. And given that we don’t know, how could it be a good idea to declare as scientists that my low SATs are caused by my PGS, or have a school make decisions about my curriculum based on them? On the other hand, if we had a working dopamine-sprayer model, and I had the bad sprayers, you would have a basis for attributing my low scores to my genes by way of my neurons. It would be like assigning individuals with Down Syndrome to special curricula. Still some room for unfairness, but understandable and scientifically sound.
One more note about an issue that lurks behind this entire discussion: group differences. If we just accept that PGS are “causal” in a generic way, it seems to me that we are well on the way to showing that they cause group differences as well as individual differences. In fact, as I have elaborated elsewhere, I think group explanation is just a corollary of individual explanation. In the example above, if my SAT scores are lower than yours because my PGS causes them to be lower, then groups of people like me have lower PGS than groups of people like you for the same reason. But that’s the point. Just as in the absence of mechanistic knowledge there is no way to sort out why you are smarter than me, in the absence of such knowledge it is impossible to sort out why Group X is smarter than Group Y.
So the best activity, for all of you who want to establish causal genetic effects, is to look for mechanisms. If you find them, all of the talk about causation will stop being philosophy and turn into science. In the meantime, it would be best to be very very careful about your claims. One question, which I sincerely don’t know the answer to: is anyone studying human PGS in model organisms? Is there enough genetic homology between primates or rats and human beings to study EA under experimental or post-mortem conditions? Just a thought.