So we can see that I've pretty much replicated their finding of a .32 effect size (95% CI .24-.41) and this holds if we exclude studies with group sizes below 40.
I think it is interesting to note that this study hasn't told us much more than we already knew: the effect size is not exactly huge if we were looking for a 'medium' effect of d > .5. Our confidence limits also do not include .5, so NICE would classify it as:
"There is evidence suggesting that there is a statistically significant difference between x and y but the size of this difference is unlikely to be of clinical significance."
Someone somewhere was asking about how effect size is influenced by study size (commonly used as a proxy for study quality). I've already said that excluding small samples doesn't affect the conclusions and looking at a simple scatter plot, if anything, larger studies have a smaller effect size. The funnel plot I've derived here is also unremarkable. Excluding the outlying study of mildly depressed subjects doesn't make much difference either.
The interesting thing is to look at the data split by individual antidepressant. We can see that paroxetine and venlafaxine have larger effect sizes (both .42, with confidence intervals crossing .5) than nefazodone and fluoxetine (.22 and .24 respectively, neither CI crossing .5). In the PLoS study they remark that:
"Although venlafaxine and paroxetine had significantly (p < 0.001) larger weighted mean effect sizes comparing drug to placebo conditions (ds = 0.42 and 0.47, respectively) than fluoxetine (d = 0.22) or nefazodone (0.21), these differences disappeared when baseline severity was controlled."
But that is a rather troubling caveat to their overall conclusion. What they are saying is that their regression analysis suggests that the venlafaxine and paroxetine trials enrolled more severe patients and that could be why they had greater responses to the medication. But at the very least we must conclude that in the trials that were actually performed and submitted to the FDA there was a reasonable effect size due to these two drugs (we might also conclude that there was little evidence of a meaningful effect size of the other two). However, according to the NICE criteria we should still say that for venlafaxine and paroxetine:
"There is evidence suggesting that there is a statistically significant difference between x and y but there is insufficient evidence to determine its clinical significance."
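To make the rule I've been applying explicit, here is a rough Python sketch of how I'm reading the two NICE statements quoted in this post: a significant effect whose 95% CI lies entirely below the d = .5 threshold gets the "unlikely to be of clinical significance" wording, and one whose CI crosses .5 gets the "insufficient evidence" wording. The function name, the labels and the handling of every other case are my own assumptions for illustration, not NICE's definitions, and the second interval in the usage lines is made up purely to show the other branch.

```python
# Sketch of the CI-vs-threshold reading used in this post (an assumption,
# not an official NICE algorithm). Only the two quoted cases are covered.

UNLIKELY = "significant, but unlikely to be of clinical significance"
INSUFFICIENT = "significant, but insufficient evidence to determine clinical significance"
OTHER = "not covered by the two statements quoted here"


def classify_effect(ci_low, ci_high, threshold=0.5):
    """Classify a 95% CI for an effect size d against a clinical threshold."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    significant = ci_low > 0  # the whole interval lies above zero
    if significant and ci_high < threshold:
        return UNLIKELY  # interval excludes the threshold from below
    if significant and ci_low <= threshold <= ci_high:
        return INSUFFICIENT  # interval crosses the threshold
    return OTHER


# Overall result reported above: d = .32, 95% CI .24-.41
print(classify_effect(0.24, 0.41))
# Hypothetical interval crossing .5, for illustration only
print(classify_effect(0.30, 0.55))
```

The point of writing it this way is that the point estimate never enters the decision; only the position of the interval relative to zero and to .5 matters.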