A confidence interval allows the analyst to replace a single value with a set of likely values, and thus to properly assess the precision of a statistical estimate. We explain here how this practical tool works.
Some headlines from the day of this writing (early April 2014):
– 41% of French people trust the new PM, Manuel Valls,
– In Lebanon, Syrian refugees now account for 25% of the population,
– Household confidence edged up slightly in March, with a 3-point increase of the synthetic index.
Lots of precise figures, conveying a reassuring feeling of accuracy, even though all of them are derived from an estimation process and thus include a random component.
Replacing a single figure with a range of likely figures is thus a promising idea: by allowing the public to assess the probable margin of error, the delivered information is richer, more nuanced and, in the end, more useful.
For example, the first sentence of the communiqué published by the French Statistical Office (INSEE) at the end of March reads: "In March, household confidence is increasing". But it is quite possible that the 3-point increase is not significant and only comes from the ordinary random fluctuations of a sampling process. Stating that household confidence is increasing and stating that it is stable are not at all the same piece of information: hence the importance and practical value of confidence intervals.
In statistical terms, we need to replace a point estimate (a single figure) by a range estimate (a confidence interval).
In this article, we assume we have at least 100 respondents; a statistician would say we are working asymptotically. Below this threshold, statistics does not have much to say anyway. We also assume the reference population is much bigger than the sample from which our figure of interest is calculated.
We need three ingredients to compute a confidence interval:
– The point estimate around which we want to compute the interval,
– The variance of that point estimate: since the estimate is derived from an estimation process, it comes with a variance, which measures the precision of that process. More precisely, we need the square root of the variance, i.e. the standard deviation,
– A confidence level, which measures the probability that the true value we seek to estimate lies in the confidence interval: in other words, the probability of being right or wrong.
Once you have these three ingredients, the calculation is very simple.
Confidence intervals are always computed in the same way:
Point estimate ± 1.96 × standard deviation
The 1.96 coefficient corresponds to a 95% confidence level. Other values would be used for other confidence levels; we will come back to that.
Suppose for example that the variance of the 3-point increase in household confidence is equal to 4 (we explain at the end of this article why this value is plausible). The standard deviation is thus 2.
With a 95% confidence level, the confidence interval around the measured figure of 3 is:
[-0.92 ; 6.92]   (lower limit: 3 - 1.96 × 2 = -0.92; upper limit: 3 + 1.96 × 2 = 6.92)
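As a minimal sketch (in Python), here is the same computation scripted; recall that the variance of 4 is an assumption discussed in this article, not a figure published by INSEE:

```python
from math import sqrt

# 95% confidence interval around the measured 3-point increase of the
# synthetic index. The variance of 4 is an ASSUMPTION (see text), not an
# INSEE-published figure.
point_estimate = 3.0      # measured variation, in index points
variance = 4.0            # assumed
std_dev = sqrt(variance)  # = 2.0
z = 1.96                  # coefficient for a 95% confidence level

lower = point_estimate - z * std_dev
upper = point_estimate + z * std_dev
print(f"95% CI: [{lower:.2f} ; {upper:.2f}]")         # [-0.92 ; 6.92]
print("0 inside the interval:", lower <= 0 <= upper)  # True: not significant
```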
The confidence level is interpreted as follows: if we asked everybody, we would measure the true value of the variation in household confidence. But we do not ask everybody, only about 2000 people. Because it derives from a sampling process, a 95% confidence interval includes this true value with probability 95%.
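This interpretation can be illustrated with a small simulation. The sketch below assumes, for illustration only, that the estimator of the variation follows a normal distribution centred on the true value with a standard deviation of 2, and counts how often the interval "estimate ± 1.96 × 2" covers the true value:

```python
import random

# Coverage simulation: draw the estimator many times from its assumed
# sampling distribution (normal, centred on the true value, sd = 2) and
# count how often the 95% interval covers the true value.
random.seed(0)
TRUE_VALUE, SD, Z, N_SIM = 3.0, 2.0, 1.96, 100_000

covered = sum(
    est - Z * SD <= TRUE_VALUE <= est + Z * SD
    for est in (random.gauss(TRUE_VALUE, SD) for _ in range(N_SIM))
)
print(f"Empirical coverage: {covered / N_SIM:.1%}")  # close to 95%
```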
In the matter at hand, 0 lies in the confidence interval. Thus, if the assumption on the variance proved correct, the available data would not fully support the claim that household confidence has indeed increased. A statistician would say that he cannot reject, at the 5% risk level, the hypothesis that household confidence remained stable.
When dealing with a percentage, computing the confidence interval is even simpler: the variance of an estimated percentage can be calculated directly from that percentage. If p is the estimated percentage and N the number of respondents, the variance is equal to p(1-p)/N. The table below gives a few examples of the standard deviation, and thus of the 95% confidence interval, around a percentage for a sample of 1000 respondents (a short computational sketch follows the table):
| Percentage | Standard deviation | 95% confidence interval |
|---|---|---|
| 10% | 0.009 | [8.1% ; 11.9%] |
| 30% | 0.014 | [27.2% ; 32.8%] |
| 45% | 0.016 | [41.9% ; 48.1%] |
| 50% | 0.016 | [46.9% ; 53.1%] |
| 55% | 0.016 | [51.9% ; 58.1%] |
| 70% | 0.014 | [67.2% ; 72.8%] |
| 90% | 0.009 | [88.1% ; 91.9%] |
This is how the margins of error of commercial polls are calculated (see the article on electoral polls). As can be seen from the table, results are symmetrical around 50%: the margin of error for an estimated percentage of 30% is identical to the one for 70%.
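The table can be reproduced with a few lines of code, using the variance formula p(1-p)/N with N = 1000 respondents:

```python
from math import sqrt

# 95% confidence intervals around an estimated percentage p, for a sample
# of N = 1000 respondents, using the variance formula p * (1 - p) / N.
N = 1000
for p in (0.10, 0.30, 0.45, 0.50, 0.55, 0.70, 0.90):
    sd = sqrt(p * (1 - p) / N)
    lower, upper = p - 1.96 * sd, p + 1.96 * sd
    print(f"p = {p:.0%}   sd = {sd:.3f}   95% CI = [{lower:.1%} ; {upper:.1%}]")
```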
Getting the shortest possible confidence intervals is always an objective of any data collection and analysis process. What can we do about it?
Two dimensions can be acted upon: the standard deviation of the estimate and the confidence level of the interval:
– The simplest way to decrease the standard deviation is to increase the number of cases (the sample size, in the context of a survey). This can be seen directly from the above formula for percentages, and it holds for any type of data: since the variance decreases like 1/N, the standard deviation decreases like 1/√N, so the sample must be quadrupled to halve the interval. Nothing beats having more data to get more precision (see the sketch after this list).
o A well-thought-out sampling scheme can also help reduce the standard deviation: the variance can be significantly decreased by stratifying the sample. This will however mainly work for quantities (number of products bought, income…) rather than for percentages.
– Decreasing the confidence level also shortens the confidence interval. Keep in mind that the confidence level measures the probability that the true value lies in the interval: the larger the interval, the greater this probability (the interval spans more values). Reducing the confidence level therefore shortens the interval, but also increases the risk that the true value lies outside it: this is the price to pay for delivering a sharper, more informative figure.
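As a small illustration of the first of these levers, the sample size, here is a sketch for an estimated percentage of 50% at the 95% confidence level (the sample sizes are purely illustrative):

```python
from math import sqrt

# Half-width of the 95% confidence interval around an estimated percentage
# of 50%, for several (illustrative) sample sizes: it shrinks like 1/sqrt(N),
# so the sample must be quadrupled to halve the interval.
p, z = 0.50, 1.96
for n in (500, 1000, 2000, 4000):
    half_width = z * sqrt(p * (1 - p) / n)
    print(f"N = {n:>4}: +/- {half_width:.1%}")
```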
The table below gives the limits of the 90%, 95% and 99% confidence intervals around the increase in household confidence discussed above (keeping the assumption that the variance is equal to 4):
| Level | Coefficient | Lower limit | Upper limit |
|---|---|---|---|
| 90% | 1.64 | -0.28 | 6.28 |
| 95% | 1.96 | -0.92 | 6.92 |
| 99% | 2.58 | -2.16 | 8.16 |
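The coefficients in this table are quantiles of the normal distribution. A short sketch reproducing it (the variance of 4, hence a standard deviation of 2, is still an assumption):

```python
from math import sqrt
from statistics import NormalDist

# Confidence interval limits around the measured 3-point increase for several
# confidence levels. The coefficient is the two-sided normal quantile.
estimate, sd = 3.0, sqrt(4.0)   # variance of 4 is an assumption (see text)
for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # 1.64, 1.96, 2.58
    print(f"{level:.0%}: coefficient = {z:.2f}, "
          f"CI = [{estimate - z * sd:.2f} ; {estimate + z * sd:.2f}]")
```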
When the level increases, the interval lengthens: there is a greater chance that the true value is covered, but the conveyed information is fuzzier and less useful.
One last note on the supposed increase in household confidence: even at the 90% level, it is not significant.
Many confidence intervals with the same confidence level can be built around the same measured point value. All the intervals below have a 95% probability of encompassing the true value (same example of the supposed increase of French household confidence in March 2014):
[-0.92 ; 6.92]   [-0.28 ; +∞)   (-∞ ; 6.28]   [-1.66 ; 6.52]   [-0.52 ; 7.66]
Only the first interval is symmetrical around the measured point value (3). That interval has a nice property: it is the shortest among all possible confidence intervals with the same confidence level.
This is why confidence intervals are conventionally built symmetrically: symmetry ensures that the interval is as short as possible. There is also a statistical rationale for choosing symmetrical confidence intervals, as explained in the article on statistical tests.
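The intervals listed above can be generated by splitting the 5% risk between the two tails in different ways: every split yields a valid 95% interval, and the symmetric split (2.5% in each tail) gives the shortest one. A sketch, still assuming a standard deviation of 2:

```python
from statistics import NormalDist

# 95% confidence intervals around the measured value 3 obtained by splitting
# the 5% risk between the lower and upper tails in different ways.
estimate, sd, alpha = 3.0, 2.0, 0.05
nd = NormalDist()
for alpha_low in (0.025, 0.05, 0.0, 0.01, 0.04):   # risk put in the lower tail
    alpha_high = alpha - alpha_low                  # remaining risk, upper tail
    lower = float("-inf") if alpha_low == 0 else estimate - nd.inv_cdf(1 - alpha_low) * sd
    upper = float("inf") if alpha_high == 0 else estimate + nd.inv_cdf(1 - alpha_high) * sd
    print(f"[{lower:7.2f} ; {upper:7.2f}]   length = {upper - lower:.2f}")
```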
Questions about the precision of a statistical measurement often boil down to performing a statistical test: is the March 2014 increase in household confidence real, or just a statistical artefact? Was the gap between the two candidates, as given by the polls on the eve of the last presidential election, statistically significant? Is the recall of my advertising campaign significantly higher than that of other similar campaigns, i.e. is my campaign effective?
A statistical test can be interpreted straightforwardly in terms of confidence intervals: the two concepts are equivalent. Since understanding a confidence interval is rather easy, the duality between confidence intervals and statistical tests helps grasp the latter concept, which can at first sight seem rather complex. See the article on statistical tests for more details.
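As a sketch of this duality on the household-confidence example (the variance of 4 is still an assumption): checking whether 0 lies in the 95% confidence interval gives the same verdict as comparing the two-sided p-value of the corresponding test with 5%.

```python
from math import sqrt
from statistics import NormalDist

# Two equivalent ways of judging the 3-point increase: does 0 lie in the 95%
# confidence interval, and is the two-sided p-value above 5%?
estimate, sd = 3.0, sqrt(4.0)                      # variance of 4 assumed
z_stat = (estimate - 0.0) / sd                     # statistic for "true variation = 0"
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))  # two-sided p-value
zero_in_ci = estimate - 1.96 * sd <= 0 <= estimate + 1.96 * sd

print(f"z = {z_stat:.2f}, p-value = {p_value:.3f}")                    # z = 1.50, p ~ 0.134
print("0 in 95% CI:", zero_in_ci, "| p-value > 5%:", p_value > 0.05)   # both True
```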
PSG won the first leg of its 2014 Champions League quarter-final against Chelsea 3-1. Pundits told us that the probability of PSG reaching the semi-final was 75% (by the way, they didn't). What is the confidence interval around that figure?
The information published by the French statistical office (INSEE) is summarized below (the long-term average followed by recent monthly readings, the last two columns being February and March 2014):
| KPI | Long-term av. |  |  | Feb. 2014 | Mar. 2014 |
|---|---|---|---|---|---|
| Synthetic Index | 100 | 85 | 86 | 85 | 88 |
| Personal financial situation – past evolution | -19 | -34 | -35 | -32 | -30 |
| Personal financial situation – perspective | -4 | -20 | -17 | -19 | -17 |
| Current saving capacity | 8 | 11 | 14 | 10 | 16 |
| Future saving capacity | -10 | -5 | -1 | -7 | 2 |
| Opportunity to save | 18 | 18 | 23 | 21 | 20 |
| Opportunity to spend | -14 | -29 | -28 | -28 | -26 |
| Standard of living – past evolution | -43 | -73 | -71 | -72 | -69 |
| Standard of living – perspective | -23 | -49 | -46 | -51 | -47 |
| Unemployment – perspective | 32 | 49 | 53 | 55 | 53 |
| Prices – past evolution | -13 | -7 | -13 | -20 | -25 |
| Prices – perspective | -34 | -17 | -16 | -24 | -30 |
We only have the balance of positive and negative answers and the sample size (around 2000 respondents). The synthetic index is a weighted average of the 11 KPIs, but the weights are not public.
The public thus does not readily have at hand the components needed to calculate the precision of the published figures. Some assumptions can be made, leading to the following table:
| KPI | Variation (Feb. to Mar.) | Variance | Standard deviation | Significantly different from 0? |
|---|---|---|---|---|
| Personal financial situation – past evolution | 2 | 5.0 | 2.2 | No |
| Personal financial situation – perspective | 2 | 4.5 | 2.1 | No |
| Current saving capacity | 6 | 4.5 | 2.1 | Yes |
| Future saving capacity | 9 | 6.0 | 2.4 | Yes |
| Opportunity to save | -1 | 5.0 | 2.2 | No |
| Opportunity to spend | 2 | 5.0 | 2.2 | No |
| Standard of living – past evolution | 3 | 3.5 | 1.9 | No |
| Standard of living – perspective | 4 | 5.0 | 2.2 | No |
| Unemployment – perspective | -2 | 5.0 | 2.2 | No |
| Prices – past evolution | -5 | 5.0 | 2.2 | Yes |
| Prices – perspective | -6 | 5.0 | 2.2 | Yes |
In the above table, you can find:
– The variation between February and March of each KPI,
– The variance of that variation. Here we need to make some assumptions: this variance is calculated as an average over various possible sets of positive and negative answers (one possible calculation is sketched at the end of this section),
– The standard deviation of this variation,
– And whether 0 lies outside the confidence interval, i.e. whether the measured variation is statistically significant.
It can be seen that, for 7 of the 11 KPIs, the variation is not significant. It is significant for 4 KPIs: 2 increases and 2 decreases. The average variance is about 5.
It would be possible for the combination of the 11 KPIs to show a significantly positive variation even though only two of them do individually. This would certainly be the case if the KPIs were uncorrelated: the sample size would then, in a sense, be leveraged. But in our case the KPIs are highly correlated, so the variance gain from aggregating them is most probably small. This is why a variance of 4 for the variation of the index seems plausible to us. To calculate it exactly, we would need access to the full data.
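To make the orders of magnitude concrete, here is one possible calculation, purely hypothetical and not INSEE's actual methodology: each respondent answers up / stable / down, the balance (in points) is estimated from roughly 2000 respondents under a simple multinomial model, and the February and March samples are assumed independent. The answer shares are invented for illustration.

```python
from math import sqrt

# Variance of a published balance and of its month-to-month variation, under
# a simple multinomial model. All answer shares below are HYPOTHETICAL.
N = 2000  # approximate sample size

def balance_variance(p_up: float, p_down: float, n: int = N) -> float:
    """Variance, in points^2, of the estimated balance 100 * (p_up - p_down)."""
    return 100 ** 2 * (p_up + p_down - (p_up - p_down) ** 2) / n

var_feb = balance_variance(p_up=0.15, p_down=0.45)  # hypothetical February shares
var_mar = balance_variance(p_up=0.17, p_down=0.43)  # hypothetical March shares
var_variation = var_feb + var_mar                   # independent samples assumed

print(f"Monthly variances: {var_feb:.1f} and {var_mar:.1f} points^2")
print(f"Variance of the Feb-to-Mar variation: {var_variation:.1f} points^2, "
      f"standard deviation: {sqrt(var_variation):.1f} points")
```

With these invented shares, the variance of the monthly variation comes out around 5 points², consistent with the order of magnitude shown in the table above.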