Saturday, 20 October 2007

Shalizi on IQ again

It is always interesting to follow the Crooked Timber threads about IQ, and, following on from Watson's little outburst, they are discussing another post (and now also here) by Cosma Shalizi on g. As before, I think they split into the more left-wing and right-on, who are just a little too dismissive of claims such as that people in Africa might have lower intelligence on average due to poor nutrition or some such, and the more right-wing and wilfully anti-PC, who are a bit too keen to attribute the problems of Africa to low intelligence rather than the myriad other factors it faces. [It is interesting to consider, re: the Flynn effect, how European populations of the past, with much lower IQs than we have now, seem to have managed just fine.]

Again, I think Shalizi has nicely, if a little convolutedly, explained how the existence of correlations between different cognitive tests inevitably leads to a common factor such as g, and tells us nothing about whether that factor represents some underlying physical reality.
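Shalizi's point can be illustrated with a toy simulation (my own sketch, not his code, with all the sizes chosen arbitrarily): generate lots of independent elementary "abilities" with no common cause at all, let each test draw on a random subset of them - essentially Thomson's sampling model - and the tests still come out positively correlated, with a large first factor.

```python
import numpy as np

rng = np.random.default_rng(0)

n_people, n_abilities, n_tests = 5000, 200, 8

# Independent elementary "abilities" -- no shared causal factor at all.
abilities = rng.standard_normal((n_people, n_abilities))

# Each test score is the sum of a random half of the abilities
# (Thomson's sampling model), plus a little measurement noise.
scores = np.empty((n_people, n_tests))
for t in range(n_tests):
    sampled = rng.choice(n_abilities, size=n_abilities // 2, replace=False)
    scores[:, t] = abilities[:, sampled].sum(axis=1) + rng.standard_normal(n_people)

corr = np.corrcoef(scores, rowvar=False)

# The off-diagonal correlations are all positive (the "positive manifold")...
off_diag = corr[~np.eye(n_tests, dtype=bool)]
print("min inter-test correlation:", off_diag.min())

# ...so the first eigenvector of the correlation matrix -- a "g" -- accounts
# for a large share of the variance, despite there being no single underlying
# factor anywhere in the data-generating process.
eigvals = np.linalg.eigvalsh(corr)[::-1]
print("variance explained by first factor:", eigvals[0] / n_tests)
```

Extracting a strong general factor from data like these is trivially easy, which is exactly why its existence cannot, on its own, tell you which kind of causal structure produced it.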

This sort of thing highlights a general problem with psychological science that is not grounded in more basic physiological science: often the explanations given are simply redescriptions of the phenomenon to be explained, and so the models have no actual explanatory value.

Shalizi also makes some valid complaints about those investigating IQ differences between racial groups and their generally poor methodology: they look at just means, variances and correlations, and control for covariates rather poorly with simple regressions, instead of looking at the whole distribution. He also refers to an interesting paper showing that racial differences are reduced when prior knowledge is controlled for. My concerns with claims that the racial IQ difference persists after controlling for all relevant socioeconomic factors stem from objections like this: you'd think a rather more comprehensive attempt would be made to find additional covariates, with rather better attempts at matching for them than naive linear regression, before categorical claims like that were made.

I think I would like to take exception to Shalizi's claims about the modularity of cognitive domains:
"If we must argue about the mind in terms of early-twentieth-century psychometric models, I'd suggest that Thomson's is a lot closer than the factor-analytical ones to what's suggested by the evidence from cognitive psychology, neuropsychology, functional brain imaging, general evolutionary considerations and, yes, evolutionary psychology (which I think well of, when it's done right): that there are lots of mental modules, which are highly specialized in their information-processing, and that almost any meaningful task calls on many of them, their pattern of interaction shifting from task to task....But the major supposed evidence for it is irrelevant, and it accords very badly with what we actually know about the functioning of the brain and the mind."
I think the evidence for separate cognitive, as opposed to perceptual, modules is rather over-egged.


Anonymous said...

Did you read the Pharyngula post on the Shalizi one? I started off hoping to use it to learn, but it rapidly turned into a sort of on-line lynch mob.

How does CS's post fit with this one? As usual, I find myself floundering in marshes of poorly understood statistics, but it seems to me that the claim that g "has no underlying physical reality" is not inconsistent with the claim that it is highly predictive of all sorts of things. If it is "just" a statistical artefact, that presumably makes it less useful for basing social policy on (because it would make more sense to find one of the underlying components to try to intervene with respect to), but it is still interesting. Rather like a stock market index?

Does this make any sense?

pj said...

Oh sure, it could be predictive, just as it could be heritable, but that predictive power is because it supervenes on some other real physical factors.

That's what I was saying about psychology: g isn't necessarily explanatory, in the sense that g isn't what causes people to, say, do better in a given job; rather, g is correlated with whatever it is that causes that.

My reading of CS's post is that he is targeting people who think the existence of g somehow proves that there is some real raw causal cognitive factor that drives our intelligence. I think CS talks about memory, and I think reaction time is similar, as the closest we have to real factors that seem to have an actual causal role, but the effect sizes are much lower than those predicted for g:

"All of this, of course, is completely compatible with IQ having some ability, when plugged into a linear regression, to predict things like college grades or salaries or the odds of being arrested by age 30. (This predictive ability is vastly less than many people would lead you to believe [cf.], but I'm happy to give them that point for the sake of argument.) This would still be true if I introduced a broader mens sana in corpore sano score, which combined IQ tests, physical fitness tests, and (to really return to the classical roots of Western civilization) rated hot-or-not sexiness. Indeed, since all these things predict success in life (of one form or another), and are all more or less positively correlated, I would guess that MSICS scores would do an even better job than IQ scores. I could even attribute them all to a single factor, a (for arete), and start treating it as a real causal variable. By that point, however, I'd be doing something so obviously dumb that I'd be accused of unfair parody and arguing against caricatures and straw-men."

I only got as far as 'PZ is a fundamentalist' before giving up - I see it has reached several hundred posts. I used to hang out on PZ's threads back in the day (to disagree with him on things like the selfish gene), but I find there are too many people there, and too little meaningful discussion.

Anonymous said...

"rather g is correlated with whatever it is that causes it" - yes, and "whatever it is" is probably not one thing but lots of things, each contributing a little bit. If it were a very small number of big things, we might as well ditch g altogether, but since it isn't, g might still be useful or interesting? Interesting, anyway; I shall have to think more about what (if anything) I mean by useful, perhaps using the market index analogy.

Anonymous said...

Good grief. I made it to comment 24 on the Crooked Timber post - it's even worse than the Pharyngula one.

pj said...

I thought on the last IQ post there, and to a lesser extent on this one, that Brett Bellmore, while he may in fact be a bit of a right-wing loon, appeared to understand the Shalizi posts better than those baiting him.

Anonymous said...

Yes. There are lots of people on both threads who appear to be arguing, from the safety of a pseudonym, on the basis of an apparent authority who has made a long and authoritative-looking post which coincides with their political prejudices, rather than on the basis of actually trying to understand the issues themselves in even the most limited way. The atheist equivalent of being holier-than-thou.

One or two interesting bits in the comments to this post (OT to the post).

pj said...

A month later, someone seems to think that this post or thread is an example of "naive third parties who find the original argument somewhat compelling" where "people who were generally skeptical of the racist arguments also tended to say things like '...but Brett Bellmore seems a lot saner than the enraged mob opposing him, nevertheless.'"

Now, what I actually said can be seen above, and I stand by it; I posted the same sentiment on the Crooked Timber post, and I've expressed similar sentiments about, and on, a previous post.

And obviously I'm not impressed at being characterised as a naif on this issue, where I at least have some form.

I've posted a reply; we'll see if it appears.

pj said...

Well, my reply was posted, but not the follow-up where I explain why I think the comments on those Crooked Timber threads reflected a misunderstanding of Shalizi's arguments, and why I think they made those mistakes.

Anonymous said...

Here is a critique which also mentions Shalizi's argument, and contends that he is missing the point:

"Jake, this is a good review and I agree with many of your major conclusions. However, your summary of the literature on g has several problems.

[g-factor] is predicated on the notion that performance across different cognitive batteries tends to be positively correlated

A quibble -- the positive correlation between performance on different test items is not just a notion but an empirical observation that has been supported by millions of data points over the last century. More on this below.

Psychological tests for g-factor use principal component analysis -- a way of identifying different factors in data sets that involve mixtures of effects.

Factor analysis, not PCA, is the method used by psychometricians. They are similar in principle but not in application.

g-factor is very controversial.

Not among intelligence researchers.

In this review, we emphasize intelligence in the sense of reasoning and novel problem-solving ability (BOX 1). Also called FLUID INTELLIGENCE (Gf), it is related to analytical intelligence [1]. Intelligence in this sense is not at all controversial...

[These authors go on to explain that in their view Gf and g are one and the same.]

From another review:

Here (as in later sections) much of our discussion is devoted to the dominant psychometric approach, which has not only inspired the most research and attracted the most attention (up to this time) but is by far the most widely used in practical settings.

This was published over a decade ago. The psychometric approach has continued to attract the most research and attention and is still by far the most widely used.

The second and broader critique of this work is whether the tests that we have for "intelligence" measure something useful in the brain.

There's wide agreement that the tests measure something useful about human behavior:

In summary, intelligence test scores predict a wide range of social outcomes with varying degrees of success. Correlations are highest for school achievement, where they account for about a quarter of the variance. They are somewhat lower for job performance, and very low for negatively valued outcomes such as criminality. In general, intelligence tests measure only some of the many personal characteristics that are relevant to life in contemporary America. Those characteristics are never the only influence on outcomes, though in the case of school performance they may well be the strongest.

A more standard criticism of g:

while the g-based factor hierarchy is the most widely accepted current view of the structure of abilities, some theorists regard it as misleading (Ceci, 1990).
that is:

One view is that the general factor (g) is largely responsible for better performance on various measures [40,85]. A contrary view accepts the empirical, factor-analytic result, but interprets it as reflecting multiple abilities, each with corresponding mechanisms [141]. In principle, factor analysis cannot distinguish between these two theories, whereas biological methods potentially could [10,22,36]. Other perspectives recognize the voluminous evidence for positive correlations between tasks and subfactors, but hold that practical, creative [142] and social or emotion-related [73] abilities are also essential ingredients in successful adaptation that are not assessed in typical intelligence tests. Further, estimates of individual competence, as inferred from test performance, can be influenced by remarkably subtle situational factors, the power and pervasiveness of which are typically underestimated [2,136,137,143].

The concepts of IQ and g-factor have been questioned by several authors. Stephen Jay Gould actually wrote a whole book -- The Mismeasure of Man -- trying to debunk the assumption that intelligence can be measured in a single number. (For a more recent and excellent critique, I recommend this article by Cosma Shalizi.) The common theme among many of these critiques is that the tests for intelligence conflate numerous separable brain processes into a single number. As a consequence, 1) you aren't sure what you are measuring, 2) you can't associate what you are measuring with a particular region (the output may be the result of an emergent process of several regions), and 3) you may be eliding significant differences in performance across individuals that you would recognize with a better test.

You give too much credit to Gould and Shalizi. Their primary criticisms are considerably less reasonable than the points you make.

The main thrust of their arguments is that test data do not statistically support a g-factor. Gould's argument is statistically incompetent (for a statistician's critique, see Measuring Intelligence: Facts and Fallacies by David J. Bartholomew, 2004). Shalizi's criticism is incredibly sophisticated, but likewise incorrect. In a nutshell, Shalizi is trying to argue around the positive correlations between test batteries. If those correlations didn't exist, his argument would be meaningful. However, as I noted above, these intercorrelations are one of the best documented patterns in the social sciences.

significant differences in performance across individuals that you would recognize with a better test.

It's possibly not well known that enormous efforts have gone into trying to make tests that have practical validity for life outcomes yet do not mostly measure g. See for example the works of Gardner and Sternberg. The current consensus is that their efforts have failed. A notable exception might be measures of personality.


Ultimately, we need to use biological measures such as cortical volume to determine what g really is. One possible approach is to combine chronometric measurements (e.g. reaction time) with brain imaging studies. Genetically informed study designs have a role to play here too.
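As an aside, the critique's distinction between factor analysis and PCA is real and easy to see in a small simulation (my own sketch using scikit-learn, not taken from the critique; the data are made up): PCA decomposes total variance, so a high-noise variable dominates its first component, whereas factor analysis models each variable's unique noise separately and recovers the shared latent trait.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(1)

# One latent trait, five noisy indicators with very different noise levels.
n = 2000
latent = rng.standard_normal(n)
noise_sd = np.array([0.2, 0.5, 1.0, 2.0, 4.0])
X = latent[:, None] + rng.standard_normal((n, 5)) * noise_sd

# PCA works on the raw covariances, so its first component is dominated by
# the noisiest (highest-variance) indicator; factor analysis estimates a
# per-variable "unique variance" and so loads the latent trait roughly
# equally on every indicator.
pca = PCA(n_components=1).fit(X)
fa = FactorAnalysis(n_components=1).fit(X)

print("PCA first component:", pca.components_[0])
print("FA loadings:        ", fa.components_[0])
print("FA unique variances:", fa.noise_variance_)
```

This is why the two methods, though "similar in principle", can give quite different answers in application: PCA summarises variance, while factor analysis models common variance against a backdrop of measurement error.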


pj said...

"Shalizi's criticism is incredibly sophisticated, but likewise incorrect. In a nutshell, Shalizi is trying to argue around the positive correlations between test batteries. If those correlations didn't exist, his argument would be meaningful. However, as I noted above, these intercorrelations are one of the best documented patterns in the social sciences."

I think I disagree with this. His argument is about whether correlations between instruments mean that there is some underlying 'g' that causes those correlations. This counter-argument appears just to be saying 'lots of correlations mean an underlying causal factor' - a claim that is simply false, as Cosma Shalizi demonstrates - but I think it is more important to understand that a statistical factor 'g', an inevitable result of multiple positive correlations, is a statistical, not a physical, construct.

Sure, there could be some underlying physical process, be it neuronal processing speed or whatever, that underlies 'g' and thus makes it a real physical construct rather than a derived statistical one - but until then we cannot say that, just because we can derive a single factor 'g', it is in any useful way 'real'.

Note that in other fields such as psychiatry people do not correlate all their imperfect instruments together and then try to claim that these correlations prove a single underlying 'mental illness factor'.

Anonymous said...

"Sure, there could be some underlying physical process, be it neuronal processing speed or whatever that underlies 'g'"

Some recent research along these lines:

"By comparing brain maps of identical twins, which share the same genes, with fraternal twins, which share about half their genes, the team calculate that myelin integrity is genetically determined in many brain areas important for intelligence. This includes the corpus callosum, which integrates signals from the left and right sides of the body, and the parietal lobes, responsible for visual and spatial reasoning and logic (see above). Myelin quality in these areas was also correlated with scores on tests of abstract reasoning and overall intelligence (The Journal of Neuroscience, vol 29, p 2212)."

"We report associations between a general cognitive ability factor (as an estimate of g) derived from the four subtests of the Wechsler Abbreviated Scale of Intelligence and cortical thickness adjusted for age, gender, and scanner in a large sample of healthy children and adolescents (ages 6–18, n = 216) representative of the US population. Significant positive associations were evidenced between the cognitive ability factor and cortical thickness in most multimodal association areas. Results are consistent with a distributed model of intelligence."

"Intelligence and the Brain", Volume 37, Issue 2, March-April 2009, Pages 145-155

"[I]n these ROIs, gC was more strongly related to structure (cortical thickness) than function, whereas gF was more strongly related to function (blood oxygenation level-dependent signal during reasoning) than structure. We further validated this finding by generating a neurometric prediction model of intelligence quotient (IQ) that explained 50% of variance in IQ in an independent sample. The data compel a nuanced view of the neurobiology of intelligence, providing the most persuasive evidence to date for theories emphasizing multiple distributed brain regions differing in function."