Now it is certainly good that Boots has conducted a clinical trial, published in the British Journal of Dermatology no less, but it is, I'm afraid, bollocks.
Scientists say they have clinical proof that a face cream available on the high street does reduce wrinkles.
Five months' worth of stock of the leading brand sold in a day after Professor Chris Griffiths announced in 2007 it appeared to combat sun damage.
Two years on from the BBC Horizon programme showcasing his work, his team has shown the cream visibly smoothes out the skin.
Boots predicts boom sales of its No 7 Protect & Perfect Intense Beauty Serum.
I'm going to ignore everything in the paper other than the clinical trial as it is, to be blunt, irrelevant. So what did they do? Well, they randomised 60 adults to placebo or the face cream in question, to apply to their hands and face for 6m, and they looked at four measures: clinical scales for fine lines and wrinkles, dyspigmentation, the overall clinical grade of photoageing, and tactile roughness, each at baseline, 1m, 3m, and 6m.
So what did they find? Well, they report that at 6m there was no statistically significant difference on any of the measures - including improvement in facial wrinkles (compared to baseline), where 43% of those who had used the product improved versus 22% of the placebo group (that's a relative improvement of 1.89 times, 95% CI .86-4.0, with a p-value of .11*). As the paper says:
"the test product did lead to a noticeable clinical improvement in facial wrinkles...in 43% of treated individuals after 6 months, compared with only 22% of those treated with the vehicle...In a comparison between groups, this improvement was not statistically significant" (my emphasis)
Huh? Yes, that's right, no differences. So why is this supposed to be a positive study? Well, that's because they were rather sneaky: after 6m - at which point there were no significant differences, remember - they stopped doing a blinded, placebo-controlled trial and put all the subjects on the face cream. They then extrapolated the placebo response (using linear regression, and presumably the 1m, 3m, and 6m response rates) to 'guess' what the 12m response rate might be. They found that 70% of the 60 people now getting cream had improved facial wrinkles (not improved hand wrinkles, or improved dyspigmentation, photoageing, or tactile roughness of the hands or face**) while they estimated that only 33% of the extrapolated placebo group would have improved.
Now we don't know anything about this regression because they don't give us any data from baseline***, 1m, or 3m, but you might argue that 33% seems a rather low rate. Since we might want any regression to go through 0,0 (at baseline there can have been no improvement) and we only have the 6m data presented, we could suggest that 44% (double the 22% seen at 6m) would be just as reasonable a 12m response rate for the placebo group.
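The 44% figure is nothing more than a straight line drawn through the origin and the one placebo data point we are given. A minimal sketch of that arithmetic, assuming the 22% six-month placebo rate is the only input; the function name is made up for illustration, and this is not the regression the authors ran (they don't describe it):

```python
def extrapolate_through_origin(rate_pct, observed_month, target_month):
    """Extend the line through (0, 0) and (observed_month, rate_pct) out to target_month."""
    slope = rate_pct / observed_month  # percentage points of responders gained per month
    return slope * target_month


# Assumption: the only placebo figure available is the 22% reported at 6 months.
placebo_12m = extrapolate_through_origin(22.0, 6, 12)
print(f"Extrapolated 12m placebo response: {placebo_12m:.0f}%")
```

A line forced through the origin with a single data point is about as crude as extrapolation gets, which is rather the point: with nothing else published, it is no less defensible than the 33% in the paper.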
This is an inherently dodgy way to go about analysing the data (it gives free additional sample size, artificially inflating the power of the study), and the data are now not even from a blinded randomised trial but from an open-label trial (everyone now knows they're on the cream, not placebo). But let's look at what the results give. Assuming the placebo group is 30 (and, of course, that group doesn't really exist) and the cream group has 60 people, we get a response rate of 70% for cream and 33% for placebo (2.10 times relative improvement, 95% CI: 1.23-3.58, p=.006****). If we assume my 44% placebo response it is 1.62 times (95% CI: 1.04-2.51, p=.03).
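For anyone who wants to check figures like these, here is a rough sketch of a relative-risk calculation with a log-scale Wald confidence interval and z-test. This is a stand-in and not necessarily the exact method used for the numbers above. The counts are assumptions derived from the quoted percentages: 42 of 60 on cream (70%), and 10 of 30 (33%) or 13 of 30 (about 44%) in the imaginary placebo arm. Like the figures above, it treats the two groups as independent, which they are not.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z_crit=1.96):
    """Risk ratio of group A over group B with a log-scale Wald CI and two-sided p-value."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    log_rr = math.log(rr)
    lower = math.exp(log_rr - z_crit * se_log)
    upper = math.exp(log_rr + z_crit * se_log)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(log_rr / se_log) / math.sqrt(2))
    return rr, lower, upper, p_value


# Assumed counts: 42/60 responders on cream; 10/30 or 13/30 in the made-up placebo arm.
for label, placebo_events in (("33% placebo", 10), ("44% placebo", 13)):
    rr, lower, upper, p_value = relative_risk(42, 60, placebo_events, 30)
    print(f"{label}: RR {rr:.2f}, 95% CI {lower:.2f}-{upper:.2f}, p={p_value:.3f}")
```

Under those assumed counts the output should land close to the figures quoted above, but the agreement only shows that the arithmetic is consistent, not that the comparison is legitimate.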
However, we've also forgotten that they made a lot of statistical comparisons. We'll let them off the comparisons at 1m, 3m, and 6m (they're not independent anyway; if this had been a pain relief trial, say, they might have tried to make something of them if they had proven to be significant early in the trial and then become non-significant later on, but that is unlikely here). But they did do 4 measures on each of the hands and face - that's 8 sets of statistical tests - so our error rate of .05 will get inflated by all those tests (which each have a 5% error rate) and we need to correct for that. A simple Bonferroni correction implies that we need to multiply the p-values by 8, which makes the 70%-33% comparison barely significant (p=.048) and the 70%-44% comparison non-significant (p=.24).
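The correction itself is one line of arithmetic. A small sketch, using the two p-values and the count of 8 tests from the paragraph above; the function names are illustrative, and the family-wise error figure assumes the 8 tests are independent, which measures taken on the same people will not quite be:

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value: multiply by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


n_tests = 8  # 4 measures on each of hands and face
for label, p_value in (("70% vs 33%", 0.006), ("70% vs 44%", 0.03)):
    print(f"{label}: adjusted p = {bonferroni(p_value, n_tests):.3f}")

print(f"Uncorrected family-wise error rate: {family_wise_error_rate(0.05, n_tests):.0%}")
```

Under the independence assumption, eight uncorrected tests at the 5% level carry roughly a one in three chance of throwing up at least one spurious "significant" result.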
It is worth noting that although only 13/30 showed an improvement with 6m of treatment, when the placebo arm was added in and another 6m of treatment given, a whacking great 29 further people showed an improvement, assuming the original 13 sustained theirs (i.e. we might think that a second 6m had the same response rate as the first 6m, doubling the response rate for that first 30, plus an extra 30 people having that same 6m response rate). I find that pharmacologically unlikely*****.
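The head-count behind that "29 further people" can be laid out explicitly. A rough sketch, assuming 13 responders at 6m and 70% of 60 at 12m as quoted; the "generous scenario" function is one reading of the parenthetical above, not anything reported in the paper:

```python
def further_responders(rate_12m, n_total, responders_6m):
    """New responders needed by 12m if all the 6m responders keep their improvement."""
    total_12m = round(rate_12m * n_total)
    return total_12m - responders_6m


def generous_scenario(responders_6m):
    """Original arm doubles its 6m responders; the ex-placebo arm matches the 6m count."""
    return 2 * responders_6m + responders_6m


needed = further_responders(0.70, 60, 13)  # 42 responders at 12m, minus the original 13
scenario_total = generous_scenario(13)     # 26 from the original arm plus 13 new
print(f"Further responders needed: {needed}")
print(f"Generous scenario total: {scenario_total} of 60 ({scenario_total / 60:.0%})")
```

Even with the original arm doubling its responders in the second 6m, this reading only reaches 39 of 60, still short of the 42 that 70% implies.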
Take home message - they did a randomised blinded clinical trial for 6m and found no statistically significant effects of the Boots cream (or even remotely nearly significant, given the necessary multiple testing correction). They then did a non-trial where they essentially made up placebo control group results and gave the cream to all the real patients in a non-blinded fashion. And then, maybe, they have a borderline statistically significant result.
The data is sufficiently badly presented, and given that the clinical trial is what, ultimately, they'll use to sell it, that I'd say they have deliberately done dodgy stats to hide the negative nature of the data. God knows what the Manchester researchers were thinking, and I despair of the British Journal of Dermatology and its peer review.
I wonder what the following were thinking when they said what they did, and whether they had actually read the paper (it was published on the 28th - the same day as the BBC article - and the media have a habit of asking for quotes before anyone gets to read the article):
Nina Goad of the British Association of Dermatologists said: "Approximately one in five people using the cream will get something extra for their money over plain moisturisers. It is an interesting step forward in research although the long term benefits are unknown. The main preventable causes of skin ageing are sun exposure and smoking, so if you're worried about wrinkles, limiting these factors is sensible."
Dr Nick Lowe, clinical professor of dermatology at UCLA School of Medicine, said: "The previous rapid study reported from this group measured fibrillin a substance that predicts the formulation of collagen. More collagen should result in skin rejuvenation. This latest longer study over six months appears to confirm skin rejuvenation as measured by dermatology examination."
Dr Richard Weller, senior lecturer in dermatology at the University of Edinburgh, said: "This is, as far as I am aware, the first properly conducted placebo controlled, double blind trial of an over the counter cosmetic product. Boots are to be congratulated for doing this."
NHS Behind the Headlines has also covered this story.
Acknowledgements to the observations of willowtree and BenFranklin in the thread.
We can see here from the press release from Manchester that there was specific misrepresentation:
"The study, published online in the British Journal of Dermatology today (Tuesday, April 28), showed that 70% of individuals using the beauty product had significantly fewer wrinkles after 12 months of daily use compared to volunteers using a placebo." (my emphasis)
* Mantel-Haenszel, assuming 30 in each group. We can only assume, because they give no details of the numbers in each group or whether any dropped out. Looking at the 22% placebo response rate, I think that either the two groups were not of equal size or there were dropouts, because that figure does not give a whole number of responders if you assume a sample size of 30. In the actual study they report p-values derived from Wilcoxon rank tests, which doesn't make any sense given the data they present.
** I'll get back to these in a bit
*** It would certainly be nice to know baseline scores because, since they report improvement over baseline, differences between the two groups in baseline scores (these often happen by chance in trials, particularly small ones like this) could lead to differences in improvement (say, because those who start out less severe have less opportunity to improve because their skin was pretty good already).
**** Obviously these are make-believe stats, since this would really be a cross-over design and I'm assuming independence, and because, obviously, you just can't make up placebo responses like this.
***** Would have been nice to be able to judge that by showing the 1m, 3m etc data.