Thursday, September 24, 2009

The new HIV vaccine:- some stats

From the BBC website:

-- 8,198 people took the placebo and 74 were infected with HIV during the trial.

-- 8,197 people took the vaccine and 51 were infected.

Now let's crunch some numbers.

Start with those people receiving the placebo, the unprotected ones. Their results suggest the probability of getting HIV during the course of the trial was p = 74/8,198 ≈ 9/1,000.

A way of thinking about this is that each person had a die with 111 sides, and they rolled it once during the trial. If it came up a '1' they were infected; with any other number they were clear.

Now we consider the distribution of the total number of '1's obtained (i.e. the total number infected). It's a binomial distribution with number of trials n = 8,198, mean np = 74 and standard deviation σ = √(np(1 - p)) ≈ √(np) = √74 ≈ 8.6 (p is so small that the factor 1 - p makes almost no difference).
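As a quick check on these figures, here is a minimal Python sketch. The variable names are my own, and the only inputs are the two placebo counts quoted above. It also computes the exact binomial standard deviation √(np(1 - p)) to show how close the √(np) shortcut is.

```python
import math

n_placebo = 8198        # people given the placebo
infected_placebo = 74   # of whom this many were infected

p = infected_placebo / n_placebo        # estimated infection probability
mean = n_placebo * p                    # expected infections, np
sd_approx = math.sqrt(mean)             # shortcut used in the text, sqrt(np)
sd_exact = math.sqrt(mean * (1 - p))    # binomial sd, sqrt(np(1 - p))

print(f"p = {p:.5f} (about 1 in {1 / p:.0f})")
print(f"sd approx = {sd_approx:.2f}, sd exact = {sd_exact:.2f}")
```

The two standard deviations agree to within about half a percent, so using √74 does no harm here.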

Our null hypothesis is that the vaccine is useless, and so the fact that only 51 people were infected when using it was just the luck of the draw. How likely is that?

The shortfall is 74 - 51 = 23 infections, and 23/8.6 = 2.67 standard deviations (2.67 σ).

Turning to our tables of the normal distribution, the chance of getting this far below the mean or beyond (a one-sided tail) is only 0.38%.
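If you don't have tables to hand, the same tail probability can be sketched in a few lines of Python using the standard library's complementary error function. This assumes, as the text does, a normal approximation with mean 74 and standard deviation √74, and a one-sided tail; the variable names are mine.

```python
import math

mean = 74.0             # expected infections if the vaccine does nothing
sd = math.sqrt(mean)    # approximate standard deviation, sqrt(np)
observed = 51           # infections in the vaccinated group

# How many standard deviations below the mean the observed count sits
z = (mean - observed) / sd

# One-sided tail of the standard normal: P(Z >= z) = erfc(z / sqrt(2)) / 2
p_tail = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.2f}, one-sided tail probability = {p_tail:.4f}")
```

The result lands a little under 0.4%, in line with the table lookup.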

So yes, they are definitely seeing something. But what?

Now we look at the vaccinated population. Their probability of getting HIV appears to be p = 51/8,197 = 6.22/1,000.

But how sure can we be of this? Perhaps the vaccine is not so good but we got lucky? Or maybe the vaccine is super-good but we still had a bad result by chance?

The unknown here is the true chance of getting infected given you had the vaccine. Let v be the number of vaccinated people we would expect to be infected given that true chance. In the trial the observed number was 51.

Now we ask, in what range could v lie so that the observed value 51 still falls within a 95% confidence interval around it?

This is the range v plus or minus 1.96 standard deviations - note that the standard deviation σv = √(n·pv) = √v, where pv = v/n is the infection probability for the vaccinated group.

So we ask, for what values of v is it the case that:

1. v - 1.96 σv = 51

2. v + 1.96 σv = 51

It turns out that case 1 gives v = 67 and case 2 gives v = 39, as compared with 74 in the placebo group.
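For anyone who wants to see where 67 and 39 come from: substituting s = √v turns each case into a quadratic in s, namely s² - 1.96s - 51 = 0 for case 1 and s² + 1.96s - 51 = 0 for case 2. A minimal Python sketch of the solution, with names of my own choosing:

```python
import math

observed = 51   # infections seen in the vaccinated group
z = 1.96        # multiplier used in the two equations above

# With s = sqrt(v):
#   case 1: s**2 - z*s - observed = 0
#   case 2: s**2 + z*s - observed = 0
# Take the positive root of each quadratic, then square to recover v.
root = math.sqrt(z * z + 4 * observed)
v_case1 = ((z + root) / 2) ** 2
v_case2 = ((-z + root) / 2) ** 2

print(f"case 1: v = {v_case1:.1f}")
print(f"case 2: v = {v_case2:.1f}")
```

Rounded to whole people, these are the 67 and 39 quoted above.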

So this is telling us that we can be 95% confident that after vaccination of 8,197 people, somewhere between 39 and 67 people could expect to be infected ... as against 74 infected without the vaccination. Pretty wide error bars!

How much better is the vaccine than the placebo? The published report said 31.2% better. The raw counts give almost the same figure: (74 - 51)/74 = 31.1%.

Sounds good. However, with 95% confidence we can only say that the vaccine is somewhere between (74 - 67)/74 = 9% better and (74 - 39)/74 = 47% better.
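The three percentages are all the same calculation with a different vaccinated count plugged in. A small Python sketch, where `improvement` is a helper name I made up for illustration and the placebo count of 74 is treated as fixed, as in the text:

```python
def improvement(vaccinated_infections, placebo_infections=74):
    """Fractional reduction in infections relative to the placebo group."""
    return (placebo_infections - vaccinated_infections) / placebo_infections

# Upper bound, observed value and lower bound for the vaccinated group
for count in (67, 51, 39):
    print(f"{count} infected -> {improvement(count):.1%} better than placebo")
```

Note that this treats the placebo figure as exact; it has sampling error of its own, which would widen the range further.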

So 31.2% better? Or somewhere between 9% and 47% better?

You decide what to do next!