19 Jan Correlation, Prediction and Explanation in the Genome
I said yesterday that GPS are basically an extension of twin and family studies, making it possible to use singletons to conduct the kind of research that used to require twins or other family members. Such research can be useful for some purposes but it has not led to a revolution in genetic explanation of human behavior.
PvS, on the other hand, say,
GPSs for intelligence will transform research on the causes and consequences of individual differences in intelligence….
This is the core difference that has to get worked out. The key word in the quote from PvS is cause. Their belief, if I understand them right, is that GPS improves on classical quantitative genetics because it is based on actual DNA, quantified by us, as opposed to correlation matrices of twins at a distance from the actual DNA. Everything turns on what they mean by cause, and I have to be a little harsh with them on this point. They say,
GPSs are unique predictors in the behavioural sciences. They are an exception to the rule that correlations do not imply causation in the sense that there can be no backward causation when GPSs are correlated with traits. That is, nothing in our brains, behaviour or environment changes inherited differences in our DNA sequence.
Really? Their definition of cause is, if x and y are correlated, and x happens before y, then x causes y?
So, for example:
- Children whose parents drove them home from the maternity hospital in a BMW have higher IQ scores in the third grade than children whose parents took the bus, therefore maternal transportation causes IQ.
- In a mixed Japanese-American sample, a GPS that predicts chopstick use comprises the SNPs that cause people to use chopsticks.
- And, harking back to Paige’s post, Sandy Jencks’s ginger child: In a society in which red-haired kids are systematically abused, SNPs associated with red hair are causes of IQ differences.
These examples point out different issues. Let’s work through the ginger child problem, because I think it is the most far reaching. Why exactly does it seem wrong to call these hair color genes IQ genes? In some sense, they are causal. Under the circumstances described, genes for red hair cause kids to get abused, and the abuse causes IQ differences. But the example shows us something important that applies to all assertions of cause: causes come with domains to which they apply. Our knowledge of the mechanism of the association between red hair genes and IQ shows us that they only cause IQ differences in a society that abuses ginger children. If we were assigning children to classrooms (looking ahead to tomorrow’s post on ethics) it would be ridiculous and evil to put the kids with ginger genes in the low performing classroom. Instead, we should be working to change society so they don’t get abused. So assertions of cause always pertain to a domain, and in human behavior those domains are often under our control.
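To make the domain point concrete, here is a minimal simulation sketch. Everything in it is a made-up assumption for illustration: the 10% allele frequency, the 10-point IQ penalty for abuse, and the function names are hypothetical, not estimates from any study. It shows the same allele correlating with IQ in one society and not at all in another.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def allele_iq_correlation(abusive_society, n=20000, seed=0):
    """Correlation between a red-hair allele and IQ in a simulated society.

    Assumptions (all hypothetical): 10% of children carry the allele,
    IQ is drawn from N(100, 15), and abuse costs 10 IQ points.
    The allele has no biological path to IQ; the only path runs
    through how society treats red-haired children.
    """
    rng = random.Random(seed)
    alleles, iqs = [], []
    for _ in range(n):
        has_red = 1 if rng.random() < 0.10 else 0
        abused = has_red == 1 and abusive_society
        iq = rng.gauss(100, 15) - (10 if abused else 0)
        alleles.append(has_red)
        iqs.append(iq)
    return pearson(alleles, iqs)


r_abusive = allele_iq_correlation(abusive_society=True)
r_neutral = allele_iq_correlation(abusive_society=False)
print(f"abusive society: r = {r_abusive:.3f}")
print(f"neutral society: r = {r_neutral:.3f}")
```

In the abusive society the allele "predicts" lower IQ; in the neutral one the correlation sits near zero. Nothing about the DNA changed between the two runs, only the domain.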
The ginger child is a certain kind of argument you find if you have been discussing something for a long time. It has been around so long that it has become a chestnut. I can see hereditarians roll their eyes when I bring it up. But the reason it is a chestnut is that it has never been rebutted. And, as I emphasize throughout, it is not some airy-fairy philosophical point about what counts as a cause; it is the crucial issue on which the whole debate turns. If you doubt that, try substituting “skin color in America” for red hair, and see what happens.
[Elsewhere in the paper, PvS try out a different notion of cause, trying to soften the impact of some ethical concerns by suggesting,
GPSs are ‘less dangerous’ because they are intrinsically probabilistic, not hard wired and deterministic like single gene disorders.
What does it mean to say that the relation between two variables is “intrinsically probabilistic”? I don’t know, but it sounds a lot like a euphemism for, “They are correlated and we don’t know why.” This isn’t quantum physics. Anyway, more on that in tomorrow’s post on ethics]
So the crucial issue in evaluating the implications of GPS as causal agents for IQ is understanding the scope of the domain to which they apply. Not just across populations of people, although that is important, but also across time and situations. Yes, a particular GPS is correlated with IQ now, in the modern West with all the myriad particulars that generalization entails, but will it still work at other times, in other places? If not, then what seem to be SNPs-as-causes really aren’t; they are just markers for some red-haired child situation that we can’t see because we are too close to it. And since you can’t in principle expose a GPS to all possible contexts to find out what happens, you are pretty much stuck. Genomics, welcome to social science.
Is there any way out of this dilemma? The answer is that we can short-circuit the problem of investigating all possible contexts by establishing biological mechanism. Take trisomy as a cause of Down Syndrome and its attendant low IQs. It might be possible to make a specious argument that there might be a world out there somewhere in which special conditions work to undo the negative consequences of trisomy, but that is indeed airy-fairy. We understand the biology of trisomy, we know what it does to brains and how those brain malformations relate to IQ. Understanding the biological mechanism means that we don’t have to wonder about the effects of trisomy in East Asia or vegetarians or societies with universal health care. We understand it mechanistically, and exactly what it means to understand something mechanistically is that we know how it works across a very wide range of contexts.
All this comes at a moment when PvS, and Robert Plomin in particular, appear to have given up on the possibility of understanding the genetic biology of intelligence.
A bottom-up approach to intelligence focused on specific genes will be difficult for three reasons. First, genetic effects are extremely pleiotropic. Second, many hits are in intergenic regions, which means that there are no ‘genes’ to trace through the brain to behaviour. Third, the biggest hits have minuscule effects (less than 0.05% of the variance), which means that hundreds of thousands of SNP associations are needed to account for the 50% heritability estimated by twin studies. A systems biology approach to molecular studies of the brain is needed that is compatible with this extreme pleiotropy and polygenicity.
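The arithmetic behind the third reason is easy to check. The sketch below is a back-of-the-envelope calculation, not anything from the paper: it assumes SNP effects are independent and additive, so that variance shares simply sum, and the helper name `snps_needed` is made up for illustration.

```python
from fractions import Fraction


def snps_needed(heritability, share_per_snp):
    """Number of SNPs needed if each explains `share_per_snp` of the variance.

    Assumes independent, additive effects so that shares simply add up.
    Exact fractions are used to avoid floating-point rounding.
    """
    return heritability / share_per_snp


heritability = Fraction(50, 100)     # 50%, the twin-study figure in the quote
biggest_hit = Fraction(5, 10_000)    # 0.05%, the ceiling quoted for the biggest hits

# Floor on the count: pretend every SNP were as strong as the biggest hit.
floor_count = snps_needed(heritability, biggest_hit)
print(f"at least {floor_count} SNPs")
```

That floor is 1,000 SNPs, and it holds only if every SNP were as strong as the strongest one found. Since the quote says the biggest hits explain less than 0.05% and typical effects are smaller still, the real number required is far larger.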
It wasn’t very long ago that a Plomin review of the genetics of intelligence was all about gene finding, what he used to call quantitative trait loci, or QTLs. It is remarkable that in a Robert Plomin paper titled “The New Genetics of Intelligence” there is not a single mention of a candidate gene, or of a mechanism of any kind. I don’t mean to be too tough on him: Robert Plomin was talking seriously about the genomics of intelligence long before anyone else was, and he should be credited for abandoning an approach that didn’t work and moving on to something else. But it is important to realize what a retreat GPS are from the early expectations of what the genomics of intelligence was going to accomplish. At first, linkage analysis looked for big genes with direct effects, but they weren’t there; then candidate gene studies looked for small genes with direct effects, and they weren’t there either; then GWAS looked for tiny genes with specifiable biological effects, and they weren’t there either; and here we are. In GPS you add those SNPs up without even caring what they are, because your only goal is to predict, not to understand. And even though prediction can sometimes be useful, without understanding there is no way of knowing how broadly it applies, which is to say whether it qualifies as a cause in any meaningful way.
Tomorrow: The Ethics of GPS