A report published by Policy Exchange seeks to defend right-wing academics against the suppression of their academic freedoms. Their cause is open to question, and I’m not sure that I should be bothering with a report that has been described as ‘methodologically abysmal’, but I’m intrigued that there’s so little understanding of basic research methods on both sides of the argument. On one hand, we have this somewhat inept explanation in the report itself:
The sample consists of 820 respondents (484 currently employed and 336 retired; average age of current academics is 49 and of those retired is 70). Given the approximately 217,000 academic staff working in British universities in 2018-19, our sample is proportionately many times larger than a conventional opinion survey (typically a sample of 1,500 across a national population of 60m). As such our data has a good claim to being representative of the wider academic population even though, as with all opinion surveys, there is a margin of error in the results.
A survey isn’t made more representative simply by being larger. There are potential biases in the inclusion of a hefty proportion of retired academics, and in the assumption that non-responses (from page 51, 24% to 39% of the totals) don’t skew the results.
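The report’s comparison with a 1,500-person national poll also misunderstands where a margin of error comes from. Under the standard simple-random-sampling formula, sampling error depends on the absolute size of the sample, not on what fraction of the population it covers – and it says nothing at all about bias. A back-of-the-envelope sketch (the function and the framing here are mine, not the report’s):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a simple random sample of size n.
    Note that the population size (60m, 217,000...) never appears."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"n=1500 (national poll):   \u00b1{margin_of_error(1500):.1%}")
print(f"n= 820 (report's sample): \u00b1{margin_of_error(820):.1%}")
```

On this conventional reckoning the 820-strong sample has a slightly *wider* margin of error than the 1,500-person poll it is compared with, and neither figure says anything about non-response or a skewed sampling frame.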
On the other hand, we have the combative response of Jonathan Portes, who comments that this would fail any basic undergraduate course on statistics. Well, he’s right that their argument is based on bad statistics. The reporting of the methodology and the questions isn’t systematic or complete. The size of a sample does not make it representative, and making it bigger does not make it more representative; it only magnifies the bias. But of course this sort of thing wouldn’t actually fail a project, because undergraduate projects are judged by what they do, not just by how sound they are. I’m also troubled by Portes’s dismissal of ‘dubious anecdotes’, the common complaint of those who believe in the inherent superiority of numbers. What is the difference between ‘anecdotes’ and responses that can be counted? Why is richer, fuller evidential material less credible than ticked boxes? Qualitative research studies do the same kind of thing that is done in the courts: they look for evidence, and they look for corroboration of that evidence. The ‘anecdotes’ in most research studies, including this report, are the bits that really matter.

Additional note, 13th August: Jonathan Portes has written to me to clarify that he was intending to challenge accounts that he thought were ‘fabricated’, rather than the validity of using anecdotes.
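The point about size and bias can be shown with a toy simulation (the numbers are entirely hypothetical): if the sampling frame over-represents people who hold a particular view, enlarging the sample only yields a more precise estimate of the wrong quantity.

```python
import random

random.seed(42)

def biased_survey(n, frame_p=0.6):
    """Simulate a survey drawn from a skewed frame: respondents hold
    the view with probability frame_p, though the true population
    rate is 0.5. Returns the sample proportion."""
    return sum(random.random() < frame_p for _ in range(n)) / n

# Larger n narrows the spread around 0.6; it never moves the
# estimate back towards the true population value of 0.5.
for n in (500, 5_000, 50_000):
    print(f"n={n:>6}: estimate = {biased_survey(n):.3f} (truth = 0.500)")
```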
In the course of my career, I’ve taught research methods for about twenty years. I’ve often found that neophyte students come to the subject with preconceptions about what research evidence ought to look like: ideally there should be numbers, and clear categories of response, and statistics, and statements about representativeness. That seems to be the attitude that has prevailed here. The basic questions we need to ask, however, are not about statistics. They are, rather, questions of what makes for evidence, and what we should make of the evidence when we have it. The Policy Exchange report tells us openly that it was looking for corroboration of problems experienced in a small number of widely reported incidents – that’s the background to their report, in Part 1. Their sample consisted of academics and retired academics registered as respondents on YouGov. There may have been some statistical biases in that process, and it’s possible that the retired academics may have answered differently from others; we do not have enough information to tell.
Their respondents pointed to a range of issues. The question they ought to have asked about their data, then, was not ‘is the sample big enough?’, or even ‘how representative is this sample?’, but ‘what does the evidence tell us about the issue we are looking at?’ The first thing you can get from a survey like this is a sense of whether there’s an issue at all. The second is whether there is corroboration – whether different people, in different places, have had related experiences. There’s some limited evidence to back that up – there are contributions from a handful of right-wing academics, but the report also indicates that there is a small but identifiable element of political discrimination across the spectrum. (I’ve encountered that myself: I have been rejected more than once for jobs because the external assessor at interview objected to something I’d written about poverty.) Interestingly, there is little in the survey relating to more extreme examples, and ‘no platforming’ hardly appears as a problem. The third is whether we can discern patterns of behaviour. That’s more difficult to judge, and it’s where information about extent might have been helpful; the main pattern the report claims to identify is a ‘chilling effect’: people who are fearful of consequences tend to alter their behaviour to avoid the potential harm. That’s plausible but not conclusive.
The two main weaknesses in this report, in my view, are not about statistics at all. The first rests in the bias of the design. The questions asked people tendentiously about right-wing causes such as multiculturalism, diversity and family values. An illustrative question:
If a staff member in your institution did research showing that greater ethnic diversity leads to increased societal tension and poorer social outcomes, would you support or oppose efforts by students/the administration to let the staff member know that they should find work elsewhere? [Support, oppose, neither support or oppose, don’t know]
I suppose my immediate reaction would be that anyone who claims to ‘show’ a clear causal link between complex and unstable categories of behaviour, rather than to ‘argue’ for an interpretation, hasn’t quite grasped the nature of social science. (The same criticism would apply to someone claiming to prove the opposite.) But the questions that people ask often reveal something about the position of the team that’s asking, and this is the point at which, if I’d been asked, I’d probably have stopped filling in the questionnaire. (I wasn’t asked. I was removed some years ago from the YouGov panel after I objected to the classification of racial groups I was being asked to respond to. I got a formal letter from Peter Kellner telling me my participation was no longer required.)
The report’s other main weakness lies in its political recommendations, centred on the appointment of a national Director for Academic Freedom. I couldn’t see any clear relationship between the proposals for reform and the evidence presented.