Via badscience forums I see that there has been a placebo controlled randomised clinical trial of an anti-ageing creme that found a positive effect of the product - expect to hear about it:

Scientists say they have clinical proof that a face cream available on the high street does reduce wrinkles.

Five months' worth of stock of the leading brand sold in a day after Professor Chris Griffiths announced in 2007 it appeared to combat sun damage.

Two years on from the BBC Horizon programme showcasing his work, his team has shown the cream visibly smoothes out the skin.

Boots predicts boom sales of its No 7 Protect & Perfect Intense Beauty Serum.

Now it is certainly good that Boots has conducted a clinical trial, published in the British Journal of Dermatology no less, but it is, I'm afraid, bollocks.

I'm going to ignore everything in the paper other than the clinical trial as the rest is, to be blunt, irrelevant. So what did they do? Well, they randomised 60 adults to placebo or the face cream in question, to apply to their hands and face for 6m, and they looked at four measures (clinical scales for fine lines and wrinkles, dyspigmentation, the overall clinical grade of photoageing, and tactile roughness) at baseline, 1m, 3m, and 6m.

So what did they find? Well they report that at 6m there was no statistically significant difference on any of the measures - including improvement in facial wrinkles (compared to baseline) where 43% who had used the product improved and 22% of the placebo group (that's a relative improvement of 1.89 times, 95% CI .86-4.0, with a p-value of .11*). As the paper says:

"the test product did lead to a noticeable clinical improvement in facial [wrinkles in] 43% of treated individuals after 6 months, compared with only 22% of those treated with the vehicle...In a comparison between groups, this improvement was not statistically significant" (my emphasis)
Huh? Yes, that's right, no differences. So why is this supposed to be a positive study? Well that's because they were rather sneaky: after 6m - at which point there were no significant differences, remember - they stopped doing a blinded placebo controlled trial and put all the subjects on the face cream. They then extrapolated (using linear regression, and presumably the 1m, 3m, and 6m response rates) the placebo response to 'guess' what the 12m response rate might be. They found that 70% of the 60 people now getting cream had improved facial wrinkles (not improved hand wrinkles, or improved dyspigmentation, photoageing, or tactile roughness of the hands or face**) while they estimated that only 33% of the extrapolated placebo group would have improved.

Now we don't know anything about this regression because they don't give us any data from baseline***, 1m, or 3m, but you might argue that 33% seems a rather low rate. Since we would want any regression to go through 0,0 (at baseline there can have been no improvement), and since only the 6m data are presented, we could suggest that 44% (double the 22% seen at 6m) would be just as reasonable a 12m response rate for the placebo group.
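To make that back-of-the-envelope extrapolation concrete, here is a minimal sketch in Python. It assumes nothing more than a straight line through (0, 0) and the reported 6m placebo rate of 22%; the function name is mine, and this is not the regression the authors ran (they don't describe it).

```python
def extrapolate_through_origin(month, rate, target_month):
    """Straight line through (0, 0) and (month, rate), evaluated at target_month."""
    slope = rate / month
    return slope * target_month


placebo_6m = 0.22  # placebo response rate at 6 months, as reported
placebo_12m = extrapolate_through_origin(6, placebo_6m, 12)
print(f"extrapolated 12m placebo response: {placebo_12m:.0%}")
```

With only one real data point the line is fixed entirely by the assumption that it passes through the origin, which is rather the point: any 12m placebo figure here is a guess.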

This is an inherently dodgy way to go about analysing the data (and it gives free additional sample size, artificially inflating the power of the study), which now does not even come from a blinded randomised trial but instead from an open-label trial (everyone now knows they're on the cream, not placebo). But if we look at what the results give we might find that, assuming the placebo group is 30 (and, of course, that group doesn't really exist) and the cream group has 60 people, we get a response rate of 70% for cream and 33% for placebo (2.10 times relative improvement, 95% CI: 1.23-3.58, p=.006****) - if we assume my 44% placebo response it is 1.62 times (95% CI: 1.04-2.51, p=.03).
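For anyone wanting to check these make-believe numbers, here is a rough sketch of the calculation. It uses a plain Wald interval and z-test on the log relative risk, which is an assumption of this sketch and not necessarily the exact method behind the figures quoted above. The responder counts (13/30 and 7/30 at 6m, 10/30 and 42/60 at 12m) are also assumptions, back-calculated from the reported percentages with 30 per original arm; 7/30 is only the nearest whole number to 22%. Under those assumptions the intervals and p-values come out close to the ones quoted, though the 6m point estimate lands nearer 1.86 than 1.89.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z_crit=1.96):
    """Relative risk of A versus B, Wald 95% CI and two-sided p-value (log scale)."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial samples.
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    log_rr = math.log(rr)
    lower = math.exp(log_rr - z_crit * se)
    upper = math.exp(log_rr + z_crit * se)
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(log_rr / se) / math.sqrt(2))
    return rr, lower, upper, p_value


# Responder counts are assumptions back-calculated from the reported percentages.
comparisons = {
    "6m: cream 13/30 vs placebo 7/30": (13, 30, 7, 30),
    "12m: cream 42/60 vs extrapolated placebo 10/30": (42, 60, 10, 30),
    "12m: cream 42/60 vs my 44% placebo 13/30": (42, 60, 13, 30),
}

for label, counts in comparisons.items():
    rr, lower, upper, p_value = relative_risk(*counts)
    print(f"{label}: RR {rr:.2f}, 95% CI {lower:.2f}-{upper:.2f}, p={p_value:.3f}")
```

The independence assumption baked into the standard error is exactly the thing that doesn't hold for the 12m comparison, so treat the last two rows as illustration only.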

However, we've also forgotten that they made a lot of statistical comparisons. We'll let them off the comparisons at 1m, 3m, and 6m (they're not independent anyway; if this had been a pain relief trial, say, they might have tried to make something of them if they had proven to be significant early in the trial and then become non-significant later on - but that is unlikely here), but they did do 4 measures on each of the hands and face - that's 8 sets of statistical tests - so our error rate of p=.05 will get inflated with all those tests (which each have a 5% error rate) and we need to correct for that. A simple Bonferroni correction implies that we need to multiply the p-values by 8, which makes the 70%-33% comparison barely significant (p=.048) and the 70%-44% comparison non-significant (p=.24).
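The correction itself is only a multiplication, but a short sketch makes the arithmetic explicit. The p-values are the rounded ones from above, the function name is my own, and the family-wise error line assumes the 8 tests are independent, which they certainly aren't, so read it as a rough guide to how fast the error rate inflates.

```python
def bonferroni(p_values, n_tests):
    """Multiply each p-value by the number of tests, capping at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


n_tests = 4 * 2  # four measures on each of hands and face
raw = {"70% vs 33%": 0.006, "70% vs 44%": 0.03}
adjusted = dict(zip(raw, bonferroni(raw.values(), n_tests)))

# Chance of at least one false positive across 8 uncorrected tests,
# assuming (unrealistically) that the tests are independent.
familywise_error = 1 - (1 - 0.05) ** n_tests

for label, p in adjusted.items():
    print(f"{label}: corrected p={p:.3f}")
print(f"uncorrected family-wise error rate: {familywise_error:.2f}")
```

Bonferroni is the bluntest correction available and is conservative when the measures are correlated, but a result that only just scrapes under .05 after it is hardly a ringing endorsement.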

It is worth noting that although only 13/30 showed an improvement with 6m treatment, when the placebo arm was added in and another 6m of treatment given, a whacking great 29 further people showed an improvement, assuming the original 13 sustained theirs (i.e. we might think that a second 6m had the same response rate as in the first 6m, doubling the response rate for that first 30, plus an extra 30 people have that same 6m response rate). I find that pharmacologically unlikely*****.
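The head-count behind that is simple enough to sketch. The counts assume 30 per original arm with no dropouts (which the paper doesn't confirm), and the "same response again" scenario is just my reading of the parenthetical above, not anything the authors modelled.

```python
cream_responders_6m = 13   # 13/30 on cream improved at 6 months
n_total_12m = 60           # everyone is on cream for the second 6 months

responders_12m = round(0.70 * n_total_12m)            # 70% of 60
further_responders = responders_12m - cream_responders_6m

# Scenario: the first 30 double their 6m response, and the 30 switched
# from placebo respond at the same 6m rate as the original cream arm.
scenario_responders = 2 * cream_responders_6m + cream_responders_6m
scenario_rate = scenario_responders / n_total_12m

print(f"responders at 12m: {responders_12m}, of whom new: {further_responders}")
print(f"scenario: {scenario_responders}/{n_total_12m} = {scenario_rate:.0%}")
```

Even that generous scenario falls a little short of the reported 70%.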

Take home message - they did a randomised blinded clinical trial for 6m and found no statistically significant effects of the Boots cream (or even remotely nearly significant given the necessary multiple testing correction). They then did a non-trial where they essentially made up placebo control group results and gave the cream to all the real patients in a non-blinded fashion. And then, maybe, they have a borderline statistically significant result.

The data is sufficiently badly presented, and given that the clinical trial is what, ultimately, they'll use to sell it, that I'd say they have deliberately done dodgy stats to hide the negative nature of the data. God knows what the Manchester researchers were thinking, and I despair of the British Journal of Dermatology and its peer review.

I wonder what the following were thinking when they said:

Nina Goad of the British Association of Dermatologists said: "Approximately one in five people using the cream will get something extra for their money over plain moisturisers. It is an interesting step forward in research although the long term benefits are unknown. The main preventable causes of skin ageing are sun exposure and smoking, so if you're worried about wrinkles, limiting these factors is sensible."

Dr Nick Lowe, clinical professor of dermatology at UCLA School of Medicine, said: "The previous rapid study reported from this group measured fibrillin, a substance that predicts the formulation of collagen. More collagen should result in skin rejuvenation. This latest longer study over six months appears to confirm skin rejuvenation as measured by dermatology examination."

Dr Richard Weller, senior lecturer in dermatology at the University of Edinburgh, said: "This is, as far as I am aware, the first properly conducted placebo controlled, double blind trial of an over the counter cosmetic product. Boots are to be congratulated for doing this."

I wonder if they actually read it (it was published on the 28th - the same day as the BBC article - the media have a habit of asking for quotes before anyone gets to read the article).

NHS Behind the Headlines has also covered this story.

Acknowledgements to the observations of willowtree and BenFranklin in the thread.

We can see here from the press release from Manchester that there was specific misrepresentation:
"The study, published online in the British Journal of Dermatology today (Tuesday, April 28), showed that 70% of individuals using the beauty product had significantly fewer wrinkles after 12 months of daily use compared to volunteers using a placebo." (my emphasis)

* Mantel-Haenszel assuming 30 in each group - we can only assume because they give no details of numbers in each group or if any dropped out. Looking at the 22% placebo response rate I think that either the two groups were not of equal sizes or there were dropouts, because that figure does not give a whole number of responders if you assume a sample size of 30. In the actual study they report p-values derived from Wilcoxon rank tests, which doesn't make any sense given the data they present.
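A quick way to check the whole-number point, as a sketch: list the responder counts that would round to a reported percentage for a given group size. The helper name is mine, and it assumes the paper rounded to the nearest whole percent.

```python
def counts_matching(percent, n):
    """Responder counts k for which k/n rounds to the given whole percentage."""
    return [k for k in range(n + 1) if round(100 * k / n) == percent]


print("43% with n=30:", counts_matching(43, 30))
print("22% with n=30:", counts_matching(22, 30))

# Group sizes up to 30 for which a reported 22% is actually achievable.
possible_sizes = [n for n in range(20, 31) if counts_matching(22, n)]
print("sizes 20-30 compatible with 22%:", possible_sizes)
```

So 43% is consistent with 13 of 30, but no count out of 30 rounds to 22%, which is why I suspect unequal groups or dropouts.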

** I'll get back to these in a bit

*** It would certainly be nice to know baseline scores because, since they report improvement over baseline, differences between the two groups in baseline scores (these happen by chance often in trials, particularly small ones like this) could lead to differences in improvement (say, because those who start out less severe have less opportunity to improve because their skin was pretty good already).

**** Obviously these are make believe stats since this would really be a cross-over design and I'm assuming independence, and because, obviously, you just can't make up placebo responses like this

***** Would have been nice to be able to judge that by showing the 1m, 3m etc data.


teekblog said...

lovely analysis...! Saw a headline related to this in Metro, glad you've pulled it apart - sad fact is that there may well be products out there that do in fact protect the skin but with research and reportage of this quality we'll never know...!!


Neuroskeptic said...

Ugh. What a horrible methodology.

Do they try to explain why they decided to make the trial open-label after 6 months? I mean, obviously the answer is "because we had to find a result somehow" but I assume they didn't write that...

pj said...

Hmm - think I've put the wrong link in for the article - it is available free online here (I can't edit my blog at work).

The authors give absolutely no indication as to why they gave all subjects the cream in the last 6m (one might guess the reason was to increase the number prepared to give a skin biopsy). All they say is:

"As all volunteers used the test product in the final 6 months, the 12-month clinical assessment data were analysed using a combination of Wilcoxon's matched pairs signed rank and rank sum tests, to give an overall P-value."

In their discussion they rather misleadingly state:

"The trial was executed to the highest standards, with study creams coded and randomized at source, and with the volunteers, investigators and independent statistician 'blind' to the coding until after study completion and initial data analysis."

Which seems unlikely given that they gave all subjects the cream in the second 6m - and anyone involved in the trial design would have known this - so it wasn't 'blind'.

pj said...

Oh, I do seem to be able to edit it now...

Neuroskeptic said...

What a rubbish paper.

P.S. To edit your blog you have to be logged in as pj, and then you have to click "sign in" in the top left. The latter stage can be skipped if you post a comment, though; I think that's why it sometimes seems to work and sometimes doesn't.

Nothing to do with where you're logged in from AFAIK.

Neuroskeptic said...

Top right I mean.

pj said...

I can't 'login' on the top right from work (possibly deliberate policy re: IT use) but I can login in comments and then edit.

Rubbish, but I'm wondering whether that's incompetence or deliberate. You can't rule out incompetence in analysis even nowadays (if I had more time I might look at some of their other papers - but they do say that a statistician was involved - that'd be embarrassing for said statistician you'd think) but I think a deliberate attempt to massage the data and even mislead would be more likely.

pj said...

"Independent statistical analysis of the RCT phase of this study was provided by Chirostat Statistical Consulting, Nottingham, U.K."

Take a bow there Eddie.

Neuroskeptic said...

Right, but the hired SPSSer can't have been responsible for the decision to turn the trial open label after 6 months.

pj said...

No - to be fair he was probably hired to find some way of making the data publishable and look statistically significant.

I am unimpressed by the statistical analysis though I must say.

But if I was going to design a trial to try and maximise the positive headlines to sell my product I might well have designed it like this.

And it looks like Boots have succeeded - the headline and story is unanimously that the clinical trial showed that the Boots anti-ageing cream worked - rather than the actual and opposite finding.

Maybe Ben Goldacre will pick up on it - but I think it is now too late: Boots 1 Science 0

willowtree said...

I notice that the independent statistical analysis was done by an ex-employee of Boots. I'm not sure in what manner that counts as being independent.