It’s census time again. I’ve made several criticisms of the process in the past, at first in an e-mail list and subsequently in two of my books, and I hate repeating myself. Checking my previous posts, however, I find that I haven’t made any of those comments on the blog, so it makes sense to cycle through them now.
The first thing is to acknowledge that something like a census is essential. The census gives us the denominators – the numbers that go below the dividing line. Without those numbers, we can’t tell how big a problem is, or how evenly it’s distributed.
‘Something like’ a census, however, doesn’t mean it has to be this census. There are major problems with the UK census, and most of them are avoidable. I’ve heard Ian Diamond, who’s now the head of the Office for National Statistics, lecturing on this, and he emphasises the importance of arriving at ‘one number’. The preservation of the census in this form is its greatest weakness.
If we look around the world, we’ll find that Britain’s reliance on a one-number census is increasingly unusual. Censuses of that kind are still used in southern Europe, but other countries use different techniques. The USA has the American Community Survey, a large geographic sample of about 3% of the population, to get the fine detail. Germany has a 1% micro-census and a range of information from administrative sources. The Nordic countries are using registers of information. France has a rolling census, allowing for regular updates.
The British census is too long, too complex, too unwieldy, and too slow. If we go back to the historic archives, we can see what the census used to be: a literal count of the persons in each household, comprising names, ages and addresses. That, frankly, is all we need from a single census – and as much as we can handle. The English census has 51 questions. Every added question is another hole in the ship.
The first problem is time. It generally takes two years to get the first results from the census, and, because it’s such a massive exercise, it has to last us for 10 years after that. (We last had a mid-term census in 1966.) That means, simply enough, that the census is always out of date – by at least two, and up to twelve, years. If we want, for practical purposes, to do something useful, like setting up a primary school, the census is at best a rough, rusty guide. Rolling data would be much better for the purpose.
The second problem is accuracy. Statisticians have probably learned that as the numbers get bigger, they become more reliable – that we can be ‘more confident’ about the findings. This is just not true, at least not in the real world. What happens with big numbers is that mistakes and biases are amplified, and we are liable to invest the numbers with meaning when they may have none. I think people will remember, for example, the large number of write-ins claiming to be members of the Jedi religion; at least we can tell that’s bogus. It’s more difficult to pick holes in other responses, but we should be able to acknowledge at least that we couldn’t rely on previous censuses to get the numbers of young men right. If this census gives us an accurate, reliable count of people who are disabled or those whose gender is non-binary, I for one will be astonished.
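One way to read the point about big numbers is with a rough sketch in Python. The figures are invented purely for illustration: suppose 10% of people truly belong to some group, but the way the question is answered means the form consistently records 9%. The function names are hypothetical too, and the margin of error is the usual textbook approximation for a proportion, which assumes a simple random sample.

```python
import math


def margin_of_error(rate, n, z=1.96):
    """Approximate 95% margin of error for a proportion measured on n responses."""
    return z * math.sqrt(rate * (1 - rate) / n)


def covers_truth(true_rate, measured_rate, n):
    """True if the interval around the measured rate still contains the true rate."""
    return abs(measured_rate - true_rate) <= margin_of_error(measured_rate, n)


# Hypothetical figures: the true rate is 10%, but a consistent bias in the
# responses means the form records 9% however many people fill it in.
TRUE_RATE = 0.10
MEASURED_RATE = 0.09

for n in (1_000, 100_000, 10_000_000):
    moe = margin_of_error(MEASURED_RATE, n)
    print(
        f"n={n:>10,}  measured={MEASURED_RATE:.2%} +/- {moe:.3%}  "
        f"covers truth: {covers_truth(TRUE_RATE, MEASURED_RATE, n)}"
    )
```

On these assumed figures, the margin narrows as the responses pile up while the one-point error stays exactly where it was. The larger count looks more certain, and is no closer to being right.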
If we really want to know about these topics, the census won’t give us the information. That is going to rely on much more detailed work, probably with a qualitative component to clarify what the answers actually mean. That leads me on to social science, and finding better ways to do things. Numbers about society are indicators – that is, signposts or pointers. We do not need accurate counts of everything; we need to have enough to prepare samples, which we can look at in more depth. The census provides us with the sort of information we need to work out how to make a sample. The great mistake is to suppose it can do more than that.