At the recent Printemps des Etudes gathering (the annual hobnob of French market research), SLPV analytics gave a talk entitled "Can we still forecast?" Yes, we can, provided we stick to some basic rules: good-quality, representative data; analytics suited to the question and to the data; tackling the question from several angles; and so on.
In that talk, we argued that the first need is a clear diagnosis of forecasting errors. The Brexit/Trump/French primaries sequence might have led to the belief that there is a structural problem with survey-based forecasting. In fact, the difficulties, and the nature of the possible forecasting errors, differ each time and call for a nuanced assessment. That depth of analysis is a necessary condition for a better understanding of the issues and, ultimately, for actionable solutions.
I will come back to that in a future post.
Someone in the audience asked a very relevant question. A 45-minute talk had allowed her to understand the need for that nuanced assessment. But is it possible to convey to the general public, which has many other things on its mind and little time to spare for such analysis, the fine distinctions between situations that look alike at first glance?
A vast question, to which I cannot pretend to have an answer. At SLPV analytics, we believe it is possible to explain in simple terms what might seem complex. For that to work, however, those who make a profession of popularising science must not generate more confusion.
This tweet from Nate Silver is a good example of that confusion. Its aim is to show that polls do not underestimate populist parties in Europe, the problem that he, like the English-speaking media, calls the “Shy Voters” issue.
Let us first recall why the “Shy Voters” label really is a bad concept. The main issue with surveys is non-response, and non-response has multiple roots. It is indeed possible that some voters do not dare say whom they will vote for, and therefore lie or decline to answer. But, as I already noted in a previous post, for that to bias the results, respondents would have to be ashamed of their current vote but not of their past one. Why not, but it is a bit complicated.
A much simpler explanation is selection bias: the propensity to answer surveys depends on the characteristics of the interviewee, and in particular on the variable we want to measure. This explanation is simpler and also more universal: it covers all possible reasons for non-response, without resorting to a pseudo-sociological veneer. And the scientific literature explains how to deal with that issue…
Let us come back to the tweet. It claims there is no “Shy Le Pen Voter” issue. This is obviously wrong. Voting intentions for Marine Le Pen are underestimated in the raw collected data, because her electoral base has a lower propensity to answer surveys. Hence the need to reweight the data. If the poll figures match the actual results, it is because pollsters work hard on their reweighting. Hard and well.
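To make the mechanism concrete, here is a minimal sketch of one textbook reweighting technique, post-stratification. It is an illustration only, not a description of how any pollster actually works. Everything in it is an assumption made for the example: the two cells, their population shares, their response rates and their levels of support are invented numbers, and the names `Cell`, `raw_estimate` and `reweighted_estimate` are hypothetical. It also assumes that, within a cell, the decision to answer does not depend on the vote itself, which is the condition under which this simple correction works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stratum of the electorate, defined by a known auxiliary variable."""
    name: str
    population_share: float  # share of the electorate, assumed known from outside the survey
    response_rate: float     # hypothetical propensity to answer the survey
    support: float           # hypothetical share of the cell backing the candidate


def true_share(cells):
    """Support for the candidate in the whole electorate."""
    return sum(c.population_share * c.support for c in cells)


def sample_shares(cells):
    """Share of each cell among respondents, given unequal response rates."""
    answered = [c.population_share * c.response_rate for c in cells]
    total = sum(answered)
    return [a / total for a in answered]


def raw_estimate(cells):
    """Support measured in the raw data, with no correction."""
    return sum(s * c.support for s, c in zip(sample_shares(cells), cells))


def reweighted_estimate(cells):
    """Post-stratified estimate: each cell is weighted back to its population share."""
    estimate = 0.0
    for s, c in zip(sample_shares(cells), cells):
        weight = c.population_share / s  # > 1 for under-represented cells
        estimate += s * weight * c.support
    return estimate


# Invented numbers: group A supports the candidate more but answers surveys less.
CELLS = [
    Cell("group A", population_share=0.40, response_rate=0.05, support=0.45),
    Cell("group B", population_share=0.60, response_rate=0.10, support=0.10),
]

print(f"true share:          {true_share(CELLS):.2%}")
print(f"raw estimate:        {raw_estimate(CELLS):.2%}")
print(f"reweighted estimate: {reweighted_estimate(CELLS):.2%}")
```

With these invented figures, the raw data put the candidate at about 18.75% while the true share is 24%, and the reweighting recovers the 24%. The correction is only as good as the assumption behind it: if the propensity to answer still depends on the vote inside each cell, some bias remains, which is one reason why reweighting takes expertise.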
Why does it all matter? Secondarily, because it is annoying when experts spread fake news. But mainly because of the confusion it generates. If there is no non-response issue, how can polls sometimes be wrong (and they will be again at some point)? Why is reweighting needed at all? Could everybody not produce their own poll, with no need for proven expertise and experience?
The inaccuracy of experts feeds conspiracy theories, as Nate Silver himself brilliantly demonstrated in the case of the IPCC’s climatologists. It is a pity he does not apply to himself what he recommends to others.